There are many different audio file formats. Some are historical, some are proprietary, and some are our own. They tend to grow or change unpredictably, and new ones keep appearing. We can use tools that are provided, updated, or withdrawn at other people's discretion. Alternatively, we can use our own tools, which can be changed to suit our needs and are not dictated by other people's requirements. We have the source code, and a competent programmer can use or change it as required. Sometimes a script may be an alternative; whether it is suitable depends on the execution speed and versatility required.
To make these routines easy to use, they are modular, perform only one function each, are self-descriptive, have built-in debugging, and take a variety of inputs. They are updated from time to time as needs dictate.
These routines grew from a need to play audio files in 1990 on the Sun workstation under the SunOS 4.1 operating system, and were later extended to Solaris and other Sun architectures. Our first Sun had a primitive audio device that performed poorly and did nothing like what the maker claimed; indeed, the maker did not know what their audio device did. The routines were then extended to bypass the restrictive licensing systems imposed by software companies.
The existing routines read WAVE, ESPS, SSFF, AU, and TEXT file headers, and write ESPS and WAVE file headers. An additional routine determines the file type. The routines have self-explanatory names:
read_au_file_header
read_esps_file_header
read_ssff_file_header
read_text_file_header
read_wav_file_header
read_file_type
write_esps_sd_file_header
write_wav_file_header
More routines may be added as required.
If you wish to change these routines, add a comment at the top to declare your change, the author, and the date. Use the compile script in the same directory. It has the same name as the routine with a c prepended. For example, if you change the read_file_type.c routine, then use the cread_file_type script to compile it. It is a good idea to rename a copy of the old file, or to work in your own directory, just in case an unrecoverable error is made.
When called, a read routine opens the file, analyses the header, closes the file, and returns the information. No attempt was made to keep the file open, because returning file pointers is less flexible and narrows the programmer's choices. Reopening the file incurs no real speed penalty on modern computers and operating systems, as files are usually cached and a further disk access and delay does not occur.
The routines are called in this way:
read_wav_file_header (arg1, arg2, arg3, arg4, arg5);
Here the first argument is the file name, the second is a buffer, and the following arguments are header dependent. The last argument is the debug flag. The returned values are not changed in size or length; they are merely extracted from the header and returned as they are. The programmer may change them or not, as required.
The routines allow files to be opened in the normal way, or to be piped to the main program. This complicates matters a little. When using a pipe, the file is read character by character from Stdin. When the read_file_type routine is called, it processes the first 32 characters of the file to determine the type. It is difficult to rewind Stdin, so these characters are stored in a buffer and returned by the routine. This buffer can then be passed to the other read header routines for processing, so the information is preserved. Argument two is the buffer, and it is only used for pipes.
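The buffering scheme described above can be sketched as follows. This is a minimal illustration and not the routines' actual source: the helper names peek_header and guess_type are hypothetical, and only two well-known signatures are checked (".snd" for AU, and "RIFF" followed by "WAVE" for WAVE). The return codes follow the type numbers documented for read_file_type.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PEEK_SIZE 32   /* the routines keep the first 32 characters */

/* Hypothetical helper: read up to PEEK_SIZE bytes from a stream that
   cannot be rewound (such as stdin) and keep them in buf for later use.
   Returns the number of bytes actually read. */
size_t peek_header(FILE *in, unsigned char buf[PEEK_SIZE])
{
    assert(in != NULL);
    memset(buf, 0, PEEK_SIZE);
    return fread(buf, 1, PEEK_SIZE, in);
}

/* Hypothetical helper: guess the file type from the buffered bytes.
   Returns 3 for au, 4 for wave, and 0 for unknown. */
int guess_type(const unsigned char *buf, size_t n)
{
    if (n >= 4 && memcmp(buf, ".snd", 4) == 0)
        return 3;
    if (n >= 12 && memcmp(buf, "RIFF", 4) == 0 &&
        memcmp(buf + 8, "WAVE", 4) == 0)
        return 4;
    return 0;
}
```

A caller would run peek_header once on the pipe, pass the buffer to guess_type, and then hand the same buffer to whichever header reader matches, so the bytes already consumed from the pipe are not lost.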
The debug flag is normally set to zero, and the routine then runs silently unless there is an error. If the debug flag is set to one, the analysis is reported step by step and the information can be examined for errors. Each line also names the routine from which the debugging information comes. The debug flag can be controlled from the command line, provided the calling routine passes it on. The routines will report information like:
read_wav_file_header: opened file speech8.wav
read_wav_file_header: read 12 bytes from riff chunk
read_wav_file_header: found RIFF id ok
read_wav_file_header: found WAVE id ok
read_wav_file_header: riff length is 26878
read_wav_file_header: found (fmt ) chunk of length 16
read_wav_file_header: format length is 16
read_wav_file_header: data type is 1
read_wav_file_header: number of channels is 1
read_wav_file_header: sample rate is 8000
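READING THE FILE TYPE
There is a routine to determine the type of a file. Call the routine this way:
read_file_type (filename, &type, buf, debug);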
These are declared in the calling program as:
char *filename;
unsigned char buf[32];
int type, debug;
where:
The first argument, filename, is the name of the file to be examined. It can contain a valid file name like speech.sd or a pipe name like Stdin.
The second argument, type, is initially empty, and the type of file is returned in it as a number, where: 3 = au, 4 = wave, 7 = ESPS, 12 = text, 13 = ssff, or 0 = unknown. Future file types will be: 1 = headerless, 2 = AFsp, 5 = AIFF-C, 6 = NIST SPHERE, 8 = IRCAM, 9 = SPPACK, 10 = INRS-Telecom, and 11 = AIFF.
The third argument, buf, is initially empty and is returned empty for a normal file call. It is returned filled with the first 32 characters of the file after a call with a Stdin filename.
The fourth argument, debug, is used to report analysis information. Set it to zero for normal silent operation. Setting it to one turns on debugging messages, which may result in output like:
read_file_type: opened file bong.au
read_file_type: read 32 bytes
read_file_type: found AU id
It is also worth putting debugging code in the calling program that can be activated from the command line, for example:
if (debug == 1) { if (type == 11) printf("AIFF audio file type\n"); }
This complements the debugging code in the routine, and aids debugging during program writing and error diagnosis in the future.
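One way to keep such debugging output readable is to translate the type number into a name. The sketch below is only an assumption about how a calling program might do this: file_type_name and report_file_type are hypothetical helpers, and the table simply restates the type numbers listed for the type argument.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: map a read_file_type code to a printable name.
   Out-of-range codes are reported as "unknown". */
const char *file_type_name(int type)
{
    static const char *const names[] = {
        "unknown",      /* 0 */
        "headerless",   /* 1 (future) */
        "AFsp",         /* 2 (future) */
        "au",           /* 3 */
        "wave",         /* 4 */
        "AIFF-C",       /* 5 (future) */
        "NIST SPHERE",  /* 6 (future) */
        "ESPS",         /* 7 */
        "IRCAM",        /* 8 (future) */
        "SPPACK",       /* 9 (future) */
        "INRS-Telecom", /* 10 (future) */
        "AIFF",         /* 11 (future) */
        "text",         /* 12 */
        "ssff"          /* 13 */
    };
    int count = (int)(sizeof names / sizeof names[0]);

    assert(count == 14);   /* the table must cover codes 0 to 13 */
    if (type < 0 || type >= count)
        return names[0];
    return names[type];
}

/* Print the file type only when the debug flag is set. */
void report_file_type(int type, int debug)
{
    if (debug == 1)
        printf("calling program: %s audio file type\n", file_type_name(type));
}
```

Keeping the table in one place means a new file type only needs one new entry, and every debugging message picks it up.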
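READING AN AU FILE HEADER
There is a routine to read the header of an au type file. Call the routine this way:
read_au_file_header (filename, buf, &offset, &frequency, &channels, &data_length, &coding, debug);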
These are declared in the calling program as:
char *filename;
unsigned char buf[32];
int offset, frequency, channels, data_length, coding, debug;
where:
The first argument, filename, is the name of the file to be examined. It can contain a valid file name like speech.au or a pipe name like Stdin.
The second argument, buf, is empty for a normal file call, and this routine then ignores it. If this routine is called after determining the file type from a pipe or Stdin filename, buf contains the first 32 characters of the file, and the buffer contents will be the first part of the header analysed.
The third argument, offset, returns the number of bytes to skip to reach the first data byte in the file. It is actually the header length.
The fourth argument, frequency, returns the sample frequency of the data in the audio file.
The fifth argument, channels, returns the number of data channels in the file, which would normally be 1, or 2 for dual channel files.
The sixth argument, data_length, returns the size of the data portion of the file. This field is optional, so it may sometimes contain zero.
The seventh argument, coding, returns the type of data in the file. The type is a number, where: MULAW_8 = 1, LINEAR_8 = 2, LINEAR_16 = 3, LINEAR_24 = 4, LINEAR_32 = 5, AFLOAT = 6, ADOUBLE = 7.
The eighth argument, debug, is used to report analysis information. Set it to zero for normal silent operation. Setting it to one turns on debugging messages, which may result in output like:
read_au_file_header: opened file bong.au
read_au_file_header: format = au file
read_au_file_header: offset = 48
read_au_file_header: data length = 12446
read_au_file_header: coding = 1
read_au_file_header: frequency = 8000
read_au_file_header: channels = 1
The standard structure of au file headers is located in /home/apps/s32cdsp/include/audio.h
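For readers without access to that include file, the fixed part of an au header can be decoded as sketched below. This is an illustration under stated assumptions, not the source of read_au_file_header: it assumes the usual au layout of six 32 bit big-endian fields (magic ".snd", data offset, data size, encoding, sample rate, channels), and the names au_info, be32, and parse_au_header are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Fields of an au header, named after the read_au_file_header arguments. */
struct au_info {
    long offset;       /* bytes to skip to the first data byte */
    long data_length;  /* size of the data portion, may be zero */
    long coding;       /* 1 = MULAW_8 ... 7 = ADOUBLE */
    long frequency;    /* sample frequency */
    long channels;     /* number of channels */
};

/* Read a 32 bit big-endian unsigned value from four bytes. */
static unsigned long be32(const unsigned char *p)
{
    return ((unsigned long)p[0] << 24) | ((unsigned long)p[1] << 16) |
           ((unsigned long)p[2] << 8)  |  (unsigned long)p[3];
}

/* Hypothetical helper: decode the fixed 24 byte part of an au header.
   Returns 0 on success, -1 if the buffer is too short or lacks ".snd". */
int parse_au_header(const unsigned char *buf, size_t n, struct au_info *out)
{
    assert(buf != NULL && out != NULL);
    if (n < 24 || be32(buf) != 0x2e736e64UL)   /* ".snd" */
        return -1;
    out->offset      = (long)be32(buf + 4);
    out->data_length = (long)be32(buf + 8);
    out->coding      = (long)be32(buf + 12);
    out->frequency   = (long)be32(buf + 16);
    out->channels    = (long)be32(buf + 20);
    return 0;
}
```

Because the fixed fields fit within 24 bytes, the 32 byte buffer returned by read_file_type is enough to decode them when the file arrives through a pipe.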
READING ESPS FILE HEADER
There is a routine to read the header of an ESPS/WAVES type file. These normally have an extension of .sd, but sometimes have other extensions such as .d. The ESPS header is very complicated and may contain many headers, as an additional header is added each time an ESPS routine processes the file. This routine returns several different parameters extracted from the header. This routine does not require a license. Call the routine this way:
read_esps_file_header (filename, buf, &offset, &Dfreq, &Dstart, &record_size, &leng, &mach, &nd, &nf, &nl, &ns, &nc, columns, debug);
These are declared in the calling program as:
char *filename, columns[6000];
unsigned char buf[32];
int offset, record_size, leng, mach, debug;
double Dfreq, Dstart, nd, nf, nl, ns, nc;
where:
The first argument, filename, is the name of the file to be examined. It can contain a valid file name like speech.sd or a pipe name like Stdin.
The second argument, buf, is empty for a normal file call, and this routine then ignores it. If this routine is called after determining the file type from a pipe or Stdin filename, buf contains the first 32 characters of the file, and the buffer contents will be the first part of the header analysed.
The third argument, offset, returns the number of bytes to skip to reach the first data byte in the file. It is actually the header length.
The fourth argument, Dfreq, returns the sample frequency.
The fifth argument, Dstart, returns the start time of the file, which is useful if the file was created as a slice of another file, as it can be related to the original file. It is usually zero.
The sixth argument, record_size, returns the size in bytes of the data sampled at each tick of the sample clock. For example, a single channel 8 bit sampled file would have a record size of 1, whereas a 10 channel 16 bit sampled file would have a record size of 20.
The seventh argument, leng, returns the number of data records.
The eighth argument, mach, returns a number that indicates what hardware was originally used to record this file. The number can be: MASSCOMP_CODE = 1, SUN3_CODE = 2, CONVEX_CODE = 3, SUN4_CODE = 4, HP300_CODE = 5, SUN386i_CODE = 6, DS3100_CODE = 7, MACII_CODE = 8, SG_CODE = 9, HP800_CODE = 10, VAX_CODE = 11, DG_AVIION_CODE = 12, APOLLO_68K_CODE = 13, APOLLO_10000_CODE = 14, HP400_CODE = 15, CRAY_CODE = 16, SONY_RISC_CODE = 17, SONY_68K_CODE = 18, STARDENT_3000_CODE = 19, IBM_RS6000_CODE = 20, HP700_CODE = 21, DEC_ALPHA_CODE = 22, SOLARIS_86_CODE = 23, LINUX_CODE = 24, or UNKNOWN_CODE = 99.
The ninth argument, nd, returns the number of items of type double in each data record.
The tenth argument, nf, returns the number of items of type float in each data record.
The eleventh argument, nl, returns the number of items of type long in each data record.
The twelfth argument, ns, returns the number of items of type short in each data record.
The thirteenth argument, nc, returns the number of items of type char in each data record.
The fourteenth argument, columns, returns any text that is found in the header.
The fifteenth argument, debug, is used to report analysis information. Set it to zero for normal silent operation. Setting it to one turns on debugging messages, which may result in output like:
read_esps_file_header: opened file speech8.sd
read_esps_file_header: reading the file preamble
read_esps_file_header: read 32 bytes from the preamble
read_esps_file_header: data offset 3333 bytes
read_esps_file_header: record size 2 bytes
read_esps_file_header: machine code is 4
read_esps_file_header: read 3301 bytes from the header
read_esps_file_header: 0 doubles in record
read_esps_file_header: 0 floats in record
read_esps_file_header: 0 longs in record
read_esps_file_header: 1 shorts in record
read_esps_file_header: 0 chars in record
read_esps_file_header: 13421 data records in file
read_esps_file_header: FT_FEA file type
read_esps_file_header: frequency=8000.000000
read_esps_file_header: start time=0.000000
read_esps_file_header: TYPTXT comment added by parker: There's usually a valve.
read_esps_file_header: COMMENT sdtofea - speech.sd
read_esps_file_header: CWD sparc1:/sun4_home2/production/products/esps.sun4/demo
read_esps_file_header: SOURCE <stdin>
The standard structure of ESPS file headers is located in /usr/esps/include/esps/header.h
WRITING ESPS FILE HEADER
There is a routine to write the header of an ESPS type sd file. The ESPS header is very complicated with many fields of different sizes and types. This routine will fill those fields from the supplied parameters and return the header size. This routine does not require a license. Call the routine this way:
write_esps_sd_file_header (outname, &offset, &size, &frequency, &nrec, debug);
These are declared in the calling program as:
char outname[20];
int offset, size, frequency, debug;
long nrec;
where:
The first argument, outname, is the name of the file to be written. It can contain a valid file name like speech.sd.
The second argument, offset, returns the number of bytes to skip to reach the first data byte in the file. It is actually the header length.
The third argument, size, is the size of the data record. The routine assumes that the data will always be shorts, so size will be 2 for a single channel file or 4 for a dual channel file.
The fourth argument, frequency, is the sample frequency.
The fifth argument, nrec, is the number of data records in the file.
The sixth argument, debug, is used to report information. Set it to zero for normal silent operation. Setting it to one turns on debugging messages, which may result in output like:
write_esps_sd_file_header: filename=AH.sd
write_esps_sd_file_header: offset=44
write_esps_sd_file_header: size=2
write_esps_sd_file_header: frequency=22050
write_esps_sd_file_header: number of records=34714
write_esps_sd_file_header: opened file AH.sd ok
write_esps_sd_file_header: filling the ESPS FEA file header preamble
write_esps_sd_file_header: filling the ESPS FEA file dummy header fields
write_esps_sd_file_header: filling the ESPS FEA file header common
write_esps_sd_file_header: Wrote ESPS file preamble of 32 bytes
write_esps_sd_file_header: Wrote ESPS file header of 441 bytes
write_esps_sd_file_header: offset=473
The standard structure of ESPS file headers is located in /usr/esps/include/esps/header.h
The calling program uses this routine to create a header with all the appropriate fields correctly filled, and no actual data in the file. The calling program will then reopen the file and append the audio data.
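The second step of that workflow, reopening the file and appending the audio data, might look like the sketch below. It is an illustration only: append_samples is a hypothetical helper, not part of the routines, and it writes the samples in the byte order of the host machine, which the calling program must make agree with the header.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: append n 16 bit samples to a file whose header
   has already been written. Opening with "ab" makes every write go to
   the end of the file, so the header is left untouched.
   Returns the number of samples written, or 0 on failure. */
size_t append_samples(const char *outname, const short *samples, size_t n)
{
    FILE *fp;
    size_t written;

    assert(outname != NULL && samples != NULL);
    fp = fopen(outname, "ab");
    if (fp == NULL)
        return 0;
    written = fwrite(samples, sizeof samples[0], n, fp);
    if (fclose(fp) != 0)
        return 0;
    return written;
}
```

Appending in a separate step keeps the header routine free of any knowledge about where the audio data comes from.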
READING A WAV FILE HEADER
There is a routine to read the header of a WAV type file. These normally have an extension of .wav, but some NIST-SPHERE files also use this extension. The original WAV files had 3 parts called chunks: the Wave Chunk, Format Chunk, and Data Chunk. Microsoft has added a 4th chunk called the Fact Chunk. The WAV header was originally fairly simple, but many software writers have added undocumented variable size additions, and some have extended the existing standard chunks. This routine will handle all WAV files of old and new types, and will also skip any non-standard chunks. It returns several different parameters extracted from the header. Call the routine this way:
read_wav_file_header (filename, buf, &offset, &dtype, &channels, &frequency, &drate, &nbytes, &nbits, &dlength, debug);
These are declared in the calling program as:
char *filename;
unsigned char buf[32];
int offset, dtype, channels, frequency, drate, nbytes, nbits, dlength, debug;
where:
The first argument, filename, is the name of the file to be examined. It can contain a valid file name like speech.wav or a pipe name like Stdin.
The second argument, buf, is empty for a normal file call, and this routine then ignores it. If this routine is called after determining the file type from a pipe or Stdin filename, buf contains the first 32 characters of the file, and the buffer contents will be the first part of the header analysed.
The third argument, offset, returns the number of bytes to skip to reach the first data byte in the file. It is actually the header length.
The fourth argument, dtype, returns the type of data in the file. The data type can be: 1 = PCM, 0x0101 = MU_LAW, 0x0102 = A_LAW, 0x0103 = ADPCM.
The fifth argument, channels, returns the number of data channels in the file, which would normally be 1, or 2 for dual channel files.
The sixth argument, frequency, returns the sample frequency.
The seventh argument, drate, returns the data rate of the file in bytes per second.
The eighth argument, nbytes, returns the number of bytes sampled at each tick of the sample clock.
The ninth argument, nbits, returns the size of the data sample in bits.
The tenth argument, dlength, returns the size of the data in the file.
The eleventh argument, debug, is used to report analysis information. Set it to zero for normal silent operation. Setting it to one turns on debugging messages, which may result in output like:
read_wav_file_header: opened file newjwing.wav
read_wav_file_header: read 12 bytes from riff chunk
read_wav_file_header: found RIFF id ok
read_wav_file_header: found WAVE id ok
read_wav_file_header: riff length is 237606
read_wav_file_header: found (fmt ) chunk of length 16
read_wav_file_header: format length is 16
read_wav_file_header: data type is 1
read_wav_file_header: number of channels is 2
read_wav_file_header: sample rate is 22050
read_wav_file_header: data rate is 44100 bytes per second
read_wav_file_header: bytes per sample is 2
read_wav_file_header: bits per sample is 8
read_wav_file_header: found (data) chunk of length 237570
read_wav_file_header: offset is 44
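Skipping non-standard chunks amounts to walking the chunk list: each chunk starts with a 4 character id and a 32 bit little-endian length, so an unknown chunk can be stepped over without understanding it. The sketch below shows the idea on a header held in memory. It is not the source of read_wav_file_header; the names wav_info and scan_wav_header are hypothetical, and the pad byte after odd-length chunks follows the general RIFF convention.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Read 32 bit and 16 bit little-endian values. */
static unsigned long le32(const unsigned char *p)
{
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
}
static unsigned int le16(const unsigned char *p)
{
    return (unsigned int)p[0] | ((unsigned int)p[1] << 8);
}

struct wav_info {
    unsigned int dtype, channels, nbytes, nbits;
    unsigned long frequency, drate, dlength, offset;
};

/* Hypothetical helper: scan a WAV header held in memory.
   Returns 0 once the fmt and data chunks have both been found, else -1. */
int scan_wav_header(const unsigned char *buf, size_t n, struct wav_info *out)
{
    size_t pos = 12;          /* skip "RIFF", the riff length, and "WAVE" */
    int have_fmt = 0;

    assert(buf != NULL && out != NULL);
    if (n < 12 || memcmp(buf, "RIFF", 4) != 0 || memcmp(buf + 8, "WAVE", 4) != 0)
        return -1;

    while (pos + 8 <= n) {
        const unsigned char *id = buf + pos;
        unsigned long len = le32(buf + pos + 4);

        if (memcmp(id, "data", 4) == 0) {
            if (!have_fmt)
                return -1;
            out->dlength = len;
            out->offset = (unsigned long)(pos + 8);   /* first data byte */
            return 0;
        }
        if (len > n - pos - 8)
            return -1;                    /* chunk runs past the buffer */
        if (memcmp(id, "fmt ", 4) == 0 && len >= 16) {
            out->dtype     = le16(buf + pos + 8);
            out->channels  = le16(buf + pos + 10);
            out->frequency = le32(buf + pos + 12);
            out->drate     = le32(buf + pos + 16);
            out->nbytes    = le16(buf + pos + 20);
            out->nbits     = le16(buf + pos + 22);
            have_fmt = 1;
        }
        /* any other chunk is skipped; odd lengths carry one pad byte */
        pos += 8 + (size_t)len + (size_t)(len & 1);
    }
    return -1;
}
```

For a plain three chunk file the returned offset is 44, matching the debugging output above; an extra chunk between fmt and data simply moves the offset further into the file.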
WRITING A WAV FILE HEADER
There is a routine to write the header of a wav type file. It writes the WAV file with 3 parts called chunks, the Wave Chunk, Format Chunk, and Data Chunk. Call the routine this way:
write_wav_file_header (outname, &woffset, &dtype, &channels, &frequency, &drate, &nbytes, &nbits, &dlength, debug);
These are declared in the calling program as:
char outname[20];
int woffset, dtype, channels, frequency, drate, nbytes, nbits, debug;
long dlength;
where:
The first argument, outname, is the name of the file to be written. It can contain a valid file name like speech.wav.
The second argument, woffset, returns the number of bytes to skip to reach the first data byte in the file. It is actually the header length.
The third argument, dtype, is the type of coding used in the file. The coding can be: 1 = PCM, 0x0101 = MU_LAW, 0x0102 = A_LAW, 0x0103 = ADPCM.
The fourth argument, channels, is the number of data channels in the file, which would normally be 1, or 2 for dual channel files.
The fifth argument, frequency, is the sample frequency.
The sixth argument, drate, is the data rate of the file in bytes per second.
The seventh argument, nbytes, is the number of bytes sampled at each tick of the sample clock.
The eighth argument, nbits, is the size of the data sample in bits.
The ninth argument, dlength, is the size of the data in the file.
The tenth argument, debug, is used to report analysis information. Set it to zero for normal silent operation. Setting it to one turns on debugging messages, which may result in output like:
write_wav_file_header: filename is speech21.wav
write_wav_file_header: offset is 44
write_wav_file_header: format length is 16
write_wav_file_header: data type is 1
write_wav_file_header: number of channels is 1
write_wav_file_header: sample rate is 20000
write_wav_file_header: data rate is 40000 bytes per second
write_wav_file_header: bytes per sample is 2
write_wav_file_header: bits per sample is 16
write_wav_file_header: data length is 384000
write_wav_file_header: opened file speech21.wav ok
write_wav_file_header: filling the riff chunk
write_wav_file_header: Wrote wav file riff chunk of 12 bytes
write_wav_file_header: filling the fmt chunk
write_wav_file_header: Wrote wav file fmt chunk of 24 bytes
write_wav_file_header: filling the data chunk
write_wav_file_header: Wrote wav file data chunk of 8 bytes
The calling program uses this routine to create a header with all the appropriate fields correctly filled, and no actual data in the file. The calling program will then reopen the file and append the audio data.
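The 12 + 24 + 8 byte layout reported above can be reproduced in memory as sketched below. This is an illustration for the PCM case only and not the source of write_wav_file_header: build_wav_header is a hypothetical helper, and it derives nbytes and drate from the other parameters instead of taking them as arguments.

```c
#include <assert.h>
#include <string.h>

/* Store 16 bit and 32 bit values in little-endian byte order. */
static void put_le16(unsigned char *p, unsigned int v)
{
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)((v >> 8) & 0xff);
}
static void put_le32(unsigned char *p, unsigned long v)
{
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)((v >> 8) & 0xff);
    p[2] = (unsigned char)((v >> 16) & 0xff);
    p[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Hypothetical helper: fill hdr with a 44 byte PCM WAV header.
   Returns the header length, which is also the data offset. */
int build_wav_header(unsigned char hdr[44], unsigned int channels,
                     unsigned long frequency, unsigned int nbits,
                     unsigned long dlength)
{
    unsigned int nbytes = channels * ((nbits + 7) / 8); /* bytes per clock tick */
    unsigned long drate = frequency * nbytes;           /* bytes per second */

    assert(hdr != NULL && channels > 0 && nbits > 0);

    memcpy(hdr, "RIFF", 4);                /* riff chunk, 12 bytes */
    put_le32(hdr + 4, 36 + dlength);       /* bytes following this field */
    memcpy(hdr + 8, "WAVE", 4);

    memcpy(hdr + 12, "fmt ", 4);           /* fmt chunk, 24 bytes */
    put_le32(hdr + 16, 16);                /* format length */
    put_le16(hdr + 20, 1);                 /* data type 1 = PCM */
    put_le16(hdr + 22, channels);
    put_le32(hdr + 24, frequency);
    put_le32(hdr + 28, drate);
    put_le16(hdr + 32, nbytes);
    put_le16(hdr + 34, nbits);

    memcpy(hdr + 36, "data", 4);           /* data chunk header, 8 bytes */
    put_le32(hdr + 40, dlength);
    return 44;
}
```

With 1 channel, 20000 Hz, and 16 bits, the derived values are 2 bytes per sample and 40000 bytes per second, in agreement with the debugging output above.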
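READING AN SSFF FILE HEADER
There is a routine to read the header of an SSFF type file. Call the routine this way:
read_ssff_file_header (filename, buf, &offset, &freq, &stime, machine, columns, debug);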
These are declared in the calling program as:
char *filename, columns[6000], machine[20];
unsigned char buf[32];
int offset, debug;
float freq, stime;
where:
The first argument, filename, is the name of the file to be examined. It can contain a valid file name like msajc001.SSFF_sd or a pipe name like Stdin.
The second argument, buf, is empty for a normal file call, and this routine then ignores it. If this routine is called after determining the file type from a pipe or Stdin filename, buf contains the first 32 characters of the file, and the buffer contents will be the first part of the header analysed.
The third argument, offset, returns the number of bytes to skip to reach the first data byte in the file. It is actually the header length.
The fourth argument, freq, returns the sample frequency of the data in the audio file.
The fifth argument, stime, returns the start time of the file, which is useful if the file was created as a slice of another file, as it can be related to the original file. It is usually zero.
The sixth argument, machine, returns the type of hardware that the file was recorded on. This determines whether any byte swapping is required.
The seventh argument, columns, returns any other text that is found in the header.
The eighth argument, debug, is used to report analysis information. Set it to zero for normal silent operation. Setting it to one turns on debugging messages, which may result in output like:
read_ssff_file_header: opened file msajc001.SSFF_sd
read_ssff_file_header: format = ssff file
read_ssff_file_header: machine = SPARC
read_ssff_file_header: start = 0.000000
read_ssff_file_header: frequency = 20000.000000
read_ssff_file_header: offset = 177
read_ssff_file_header: strings = Comment CHAR Created by CONV
Column samples SHORT 1
max_value DOUBLE 13975.000000
This routine may need to be extended to extract the data type and number of channels.
The standard structure of SSFF file headers is located in /home/publish/src/mu-plus/SSFF/ssff_formats.h
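The byte swapping mentioned for the machine argument can be sketched as follows. This is an illustration only: the helper names are hypothetical, and deciding whether a given machine string (SPARC, for example, is big-endian) means most significant byte first is left to the calling program.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if this machine stores the most significant byte first. */
int host_is_big_endian(void)
{
    const unsigned int one = 1;
    return *(const unsigned char *)&one == 0;
}

/* Swap the two bytes of one 16 bit sample. */
unsigned short swap16(unsigned short v)
{
    return (unsigned short)(((v & 0xffu) << 8) | ((v >> 8) & 0xffu));
}

/* Hypothetical helper: swap a block of 16 bit samples in place, but
   only when the file byte order differs from the host byte order. */
void fix_byte_order(unsigned short *samples, size_t n, int file_is_big_endian)
{
    size_t i;

    assert(samples != NULL || n == 0);
    if ((file_is_big_endian != 0) == host_is_big_endian())
        return;
    for (i = 0; i < n; i++)
        samples[i] = swap16(samples[i]);
}
```

Testing the host at run time means the same calling program works unchanged on both kinds of hardware.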
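READING A TEXT FILE HEADER
There is a routine to read the header of a TEXT type audio file. Call the routine this way:
read_text_file_header (filename, &offset, &freq, debug);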
These are declared in the calling program as:
char *filename;
int offset, debug;
float freq;
where:
The first argument, filename, is the name of the file to be examined. It can contain a valid file name like speech.ta.
The second argument, offset, returns the number of bytes to skip to reach the first data byte in the file. It is actually the header length.
The third argument, freq, returns the sample frequency of the data in the audio file.
The fourth argument, debug, is used to report analysis information. Set it to zero for normal silent operation. Setting it to one turns on debugging messages, which may result in output like:
read_text_file_header: opened file bab.ta
read_text_file_header: format = audio text file
read_text_file_header: offset = 918
read_text_file_header: frequency = 19980.000000
This routine may need to be extended to handle Stdin, which would mean adding a buf argument.