RSPQ.ORG

Introduction

We offer speech compression technologies for internet applications.  These technologies enable real-time streaming of audio at 16 KBPS at high quality, without annoying artifacts.  The goal is a speech compression technology that works on all platforms, such as Microsoft Windows, Linux/Unix and Macintosh.  There are virtually two players in the field - Microsoft and Real Audio - with different philosophies.  Microsoft's is to encourage the use of its own platforms, while Real Audio's is characterized by aggressive marketing and blaming its woes on Microsoft.  Another disturbing factor is that these companies control the entire market, and the underlying compression technologies are proprietary and non-public.  In software, thanks to GNU and similar organizations, one need not worry about the "proprietarization" of compilers and operating systems, or about the underlying knowledge being denied to the public in the interest of profit and domination.  It is a scary thought: imagine that when Newton formulated his three laws of motion, some big company had stepped in, claimed these laws as its property, and denied access to the rest of the world.  Fortunately this did not happen, and the world used Newton's genius and knowledge to further science and human existence.

You can download executables, along with the C/C++ source code, to compress and play at 16 KBPS on Microsoft Windows 98 (I presume it works on 2000 and NT), Linux (Red Hat 7.1, BearOps), and BSD (Open BSD and Free BSD).  The open source approach will encourage porting of this technology to other platforms like Macintosh and other non-Intel platforms.  This website features an eclectic collection of musical pieces, so users may judge the quality of the compression for themselves.

The speech compression technology is based on the pioneering work of Itakura, Saito, Furui, Atal, Markel, A. Gray, R. Gray, Gersho and other speech researchers in the 1970s and 1980s.  It uses linear prediction coefficients (LPCs) to find the residual and encodes the residual using vector quantization.  There are some novelties, like "noisization" before quantization using Gaussian noise waveforms, to improve performance.  The compression part is computationally intensive - on an 800 MHz Pentium the compression is almost real-time, with one second of speech requiring 1.5 seconds of computation.  Much work remains to be done on improving the quality using line spectral pairs (LSPs), on vector quantization with larger vectors (with a consequent explosion in computational requirements), and on application to image compression.
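As a rough illustration of the residual step mentioned above, the following C sketch applies a given set of predictor coefficients to a block of samples and writes out the prediction error.  The function name, the argument layout and the convention that samples before the start of the block are zero are assumptions made for this example; they are not taken from the RSPQ source code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Linear-prediction residual (illustrative sketch, not the RSPQ code):
 *
 *     e[n] = x[n] - sum_{k=1..p} a[k-1] * x[n-k]
 *
 * x : input samples, length n
 * a : p predictor coefficients (a[0] multiplies x[n-1], a[1] multiplies x[n-2], ...)
 * e : output residual, length n
 * Samples before x[0] are treated as zero (an assumption of this sketch).
 */
void lpc_residual(const double *x, size_t n,
                  const double *a, size_t p,
                  double *e)
{
    assert(x != NULL && a != NULL && e != NULL);

    for (size_t i = 0; i < n; i++) {
        double predicted = 0.0;

        /* Use only the past samples that actually exist in the block. */
        for (size_t k = 1; k <= p && k <= i; k++)
            predicted += a[k - 1] * x[i - k];

        e[i] = x[i] - predicted;
    }
}
```

With a good predictor the residual has much less energy than the input, which is what makes it a suitable target for the vector quantizer.  How the coefficients themselves are estimated is left out of this sketch.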

Windows Platforms

There are two zip files you can download.  The console program in QComCon.zip allows you to compress files and to play compressed files over the internet.  There is a Help option which explains what to do.  To play files over the internet, execute QComCon.exe, choose the (S)ite option by pressing the 's' key, and type in:

http://www.rspq.org/rspq

You will see a list of files to play.  Choose (N)etplay and dean.wac to play Dean Martin.  For Windows programs, the prefix http:// is needed whereas for Linux you should not use it.  The Windows program in QComWin.zip has a GUI, but it can only play files.  Execute QComWin.exe, choose the HTTP Site and press the Directory button to get the selections.  Choose a selection and press Play.  The code is in C++ and uses Microsoft Visual C++ 8.0.  If you want to download the directory structure for the compressed files, click here and download rspq~!~dirs.  This is needed only if you want to feature compressed speech/music on your website.

Linux Platforms

Download the zipped tar file qcomRH71.tgz.  Untar it using the command tar -xvzf qcomRH71.tgz.  There will be three directories: console, glade and library.  If your operating system is Red Hat 7.1 or BearOps, you can run the file qcom under the console directory.  For other Linux systems, you may have to recompile with the GNU compiler.  There are makefiles in each directory.  To build the files for glade, you need GNU make (Ver. 3.79).

The console program will allow compression and playing of compressed files on the internet.  There is a Help option which explains what to do.  To play files on the internet, execute ./qcom, choose the (S)ite option by pressing the key 's' and type in:

www.rspq.org/rspq

You will see a list of files to play.  Choose (N)etplay and dean.wac to play Dean Martin.  For Windows programs, the prefix http:// is needed whereas for Linux you should not use it.

The program qcom in the glade/src directory has a GUI, but it can only play files.  Execute ./qcom, choose the HTTP Site and press the Directory button to get the selections.  Choose a selection and press Play.  The code uses the Glade development tools and the GNU C compiler.  If you want to download the directory structure for the compressed files, click here and download rspq~!~dirs.  This is needed only if you want to feature compressed speech/music on your website.

Linux systems provide drivers only for certain sound cards (like Sound Blaster), and you may get a message that your sound driver is not able to play 16-bit stereo.  In this case you may have to get an OSS/Linux driver for your card from http://www.OpenSound.com, install it and then run the above application.  It may cost around $50, which is well worth it compared with the effort and frustration of making your card work otherwise.

BSD Platforms

Download the zipped tar file qcomFBSD.tgz for Free BSD or qcomOBSD.tgz for Open BSD.  Untar it using the command tar -xvzf followed by the name of the file you downloaded.  There will be three directories: console, glade and library.  If your operating system is Free BSD or Open BSD, you can run the file qcom under the console directory.  For other BSD systems, you may have to recompile with the GNU compiler.  There are makefiles in each directory.  To build the files for glade, you need GNU make (Ver. 3.79).

The console program will allow compression and playing of compressed files on the internet.  There is a Help option which explains what to do.  To play files on the internet, execute ./qcom, choose the (S)ite option by pressing the key 's' and type in:

www.rspq.org/rspq

You will see a list of files to play.  Choose (N)etplay and dean.wac to play Dean Martin.  For Windows programs, the prefix http:// is needed whereas for Linux and BSD you should not use it.

The program qcom in the glade/src directory has a GUI, but it can only play files.  Execute ./qcom, choose the HTTP Site and press the Directory button to get the selections.  Choose a selection and press Play.  The code uses the Glade development tools and the GNU C compiler.  If you want to download the directory structure for the compressed files, click here and download rspq~!~dirs.  This is needed only if you want to feature compressed speech/music on your website.

There are two flavors of sound driver on BSD systems.  Free BSD is very much like Linux and has a device /dev/dsp for sound input/output.  Open BSD uses /dev/audio as its sound device.  The source code provides for both options - the header file oss.h indicates which system is to be used.  You can always get an OSS/BSD driver for your card from http://www.OpenSound.com, install it and have a /dev/dsp as your sound device.  If you do this, set OSS to 1 in oss.h in the lib directory.  This may cost around $50 and is well worth it to get the application running without frustration.  The only problem is that Open BSD may complain about security because of the way the driver is installed.
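The choice between the two devices can be pictured with the small C sketch below.  Only the macro name OSS and the two device paths come from the description above; the default value, the function name audio_device_path and the use of a run-time argument are assumptions made for illustration, and the real oss.h may be organized quite differently.

```c
#include <assert.h>

/* Hypothetical stand-in for the setting in oss.h:
 * 1 = OSS-style device (/dev/dsp), 0 = native Open BSD device (/dev/audio). */
#ifndef OSS
#define OSS 1
#endif

/* Return the device path that corresponds to the given OSS setting. */
const char *audio_device_path(int use_oss)
{
    assert(use_oss == 0 || use_oss == 1);
    return use_oss ? "/dev/dsp" : "/dev/audio";
}
```

A player would then open audio_device_path(OSS) for output, so that switching drivers only requires changing the one setting and recompiling.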

Version 2

Version 1.0, described above, performs a vector quantization of the residual 6-tuple over 16384 points, i.e., it searches 16384 6-tuples to find the best 14-bit representation for a given 6-tuple.  Denoting this search by VQ-14/6, in Version 2 we expand the search to VQ-15/8 and VQ-16/8.  In Version 1.0, each sample is represented by 14/6 = 2.33 bits, and in order to accommodate this the residual is filtered and downsampled to 6 kHz (instead of 8 kHz).  Because of this the resulting output signal loses components in the 3 - 4 kHz band.  In Version 2.0, this downsampling is not performed; instead the search is expanded to 15/8 = 1.88 bits/sample and 16/8 = 2 bits/sample.  Obviously compression takes longer, and Version 2.0 has a bit rate of 18 KBPS.  We also use another feature, namely natural cubic splines for interpolation of the reconstructed waveform.  You may download for Windows here or for Red Hat Linux here.  There are samples available at www.rspq.org/rspq2, and help within the program explains the usage.  Presently we are working on VQ-21/12 to get a 16 KBPS rate.  In this connection, statisticians interested in vector quantization may look into the open letter.  A research proposal indicating directions for future research and presenting preliminary results can be found at http://www.rspq.org/pubs.
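To make the VQ-b/d notation concrete, here is a minimal C sketch of an exhaustive codebook search together with the arithmetic behind the figures quoted above.  It assumes a squared Euclidean distance and a codebook stored row by row; all function names are hypothetical, and the actual search and distortion measure used in the RSPQ programs may differ.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Number of codebook entries for a b-bit index: 2^b (16384 for b = 14). */
size_t vq_codebook_size(unsigned bits)
{
    assert(bits < 32);
    return (size_t)1 << bits;
}

/* Bits spent per residual sample in a VQ-b/d scheme. */
double vq_bits_per_sample(unsigned bits, unsigned dim)
{
    assert(dim > 0);
    return (double)bits / dim;
}

/* Bit rate of the quantized residual alone, in bits per second. */
double vq_residual_bps(unsigned sample_rate_hz, unsigned bits, unsigned dim)
{
    assert(dim > 0);
    return (double)sample_rate_hz * bits / dim;
}

/*
 * Exhaustive search: return the index of the codebook vector closest to v.
 * codebook holds `size` vectors of length `dim`, one after another.
 * Distance is squared Euclidean (an assumption of this sketch).
 */
size_t vq_nearest(const double *codebook, size_t size, size_t dim,
                  const double *v)
{
    assert(codebook != NULL && v != NULL && size > 0 && dim > 0);

    size_t best = 0;
    double best_dist = DBL_MAX;

    for (size_t i = 0; i < size; i++) {
        double dist = 0.0;
        for (size_t j = 0; j < dim; j++) {
            double d = codebook[i * dim + j] - v[j];
            dist += d * d;
        }
        if (dist < best_dist) {
            best_dist = dist;
            best = i;
        }
    }
    return best;
}
```

For example, vq_residual_bps(6000, 14, 6) gives 14000 and vq_residual_bps(8000, 16, 8) gives 16000.  The text does not break down how the rest of the overall 16 and 18 KBPS figures is spent, so these numbers cover the residual only.  The cost of the exhaustive search grows with both the codebook size and the vector length, which is why the larger schemes take longer to compress.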

Licensing and Contributions

You can freely use this software for personal and non-commercial use and develop software like ActiveX controls, plug-ins and players.  If you develop applications or feature this technology on your website, let me know so that I can put up links to your products or website.  See the License or download the License File for details regarding open source code licensing.  Contributions and corporate sponsorships are always welcome, since the funds will be used to advance compression technology in speech and image processing and to maintain this website.  Direct your comments to ReddiSS@AOL.COM.

(C)opyright 2002.  RSPQ.ORG