All source code released under the BSD License unless otherwise specified
© 2010, Gavin Black

Image to Spectrogram

Overview

This project takes a digital picture and converts it into a WAV file. Creating a spectrogram of that file then reproduces the picture.

Algorithm

Treat the image as a matrix of pixels with r,g,b values. Each column of the matrix corresponds to one time interval. For every pixel in the column, use the following formula to turn its row y into a suitable frequency:

max_frequency - (y/picture_height)*max_frequency
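As a rough illustration, the mapping above can be sketched in Python. This is not the Perl script itself; the function name `row_to_frequency` and the default `max_frequency` of 20000 Hz are assumptions made for the example, since the post does not state which value is used.

```python
def row_to_frequency(y, picture_height, max_frequency=20000.0):
    """Map pixel row y to a frequency in Hz using the formula above.

    max_frequency=20000.0 is an assumed value, not taken from the post.
    """
    return max_frequency - (y / picture_height) * max_frequency


if __name__ == "__main__":
    picture_height = 100  # hypothetical image height in pixels
    for y in (0, 50, 99):
        print(y, row_to_frequency(y, picture_height))
```

Row 0 maps to `max_frequency` and the last row maps to a value just above 0 Hz, so the top of the picture ends up at the top of the spectrogram.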

We can get some coloration by changing the decibel level of each frequency based on the pixel's RGB values. Each frequency's sine term is divided by a power of 10 of the form:

decibelN = 10 ^ ((r + g + b)/ max_sum_rgb )
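A small Python sketch of this divisor follows. It assumes 8-bit channels, so `max_sum_rgb` defaults to 255 * 3; that default and the function name `decibel_divisor` are assumptions for illustration, not details confirmed by the post.

```python
def decibel_divisor(r, g, b, max_sum_rgb=255 * 3):
    """Return 10 ** ((r + g + b) / max_sum_rgb) for one pixel.

    max_sum_rgb=765 assumes 8-bit r, g, b channels (an assumption).
    """
    return 10 ** ((r + g + b) / max_sum_rgb)


if __name__ == "__main__":
    print(decibel_divisor(0, 0, 0))        # smallest RGB sum
    print(decibel_divisor(255, 255, 255))  # largest RGB sum
```

Under these assumptions the divisor ranges from 1 (RGB sum of 0) to 10 (maximum RGB sum), so with the formula as written a larger RGB sum attenuates its frequency more.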

All frequencies for a column are merged using:

(sin(2*pi + sample1)/decibel1 + sin(2*pi + sample2)/decibel2 ...
   + sin(2*pi + sampleN)/decibelN) / N
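The merge step might look roughly like the Python sketch below. It is not the author's Perl code. It assumes each sine term is evaluated as sin(2*pi*f*t) for frequency f at time t, which is one reading of the formula above. The sample rate, the column duration, and the name `synthesize_column` are also assumptions chosen for the example.

```python
import math


def synthesize_column(frequencies, divisors, sample_rate=44100, duration=0.1):
    """Average N attenuated sine waves into one list of samples.

    frequencies: one frequency in Hz per pixel in the column.
    divisors: the matching per-pixel divisors (decibel1 .. decibelN).
    sample_rate and duration are assumed values, not taken from the post.
    """
    n = len(frequencies)
    num_samples = int(sample_rate * duration)
    samples = []
    for i in range(num_samples):
        t = i / sample_rate  # time of this sample in seconds
        total = sum(
            math.sin(2 * math.pi * f * t) / d
            for f, d in zip(frequencies, divisors)
        )
        samples.append(total / n)  # divide by N as in the formula
    return samples


if __name__ == "__main__":
    column = synthesize_column([440.0, 880.0], [1.0, 10.0])
    print(len(column), min(column), max(column))
```

Because every divisor is at least 1 and the sum is divided by N, each sample stays within [-1, 1], which makes it straightforward to scale to whatever sample format the WAV writer expects.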

Results

Mp3 of generated sound

Spectrogram for mp3 of Hello World picture

Note that color is treated as grayscale, since the decibel level has only one dimension. So the picture is effectively treated like this:

Mp3 of generated sound

Spectrogram of the mp3 generated from the Rainbow pic

Mp3 of generated sound

Mandelbrot set spectrogram
Initial fractal image used to generate the mp3

Mp3 of generated sound

Spectrogram of mp3 for the fractal
Edit: Following the suggestion in the comments below, the picture was regenerated with the following command: sox fract.wav -n spectrogram -x 4 -Z -60 -z 60
Better parameters used to generate this image

Downloads and Execution

imageSpectrogram.pl -- The main Perl script. It handles png, jpg, or gif files, and requires that the file extension be present in the name. The simplest way to run it is with ./imageSpectrogram.pl image_filename.

There are 2 Perl CPAN dependencies needed:
  • cpan Audio::Wav
  • cpan GD
To get a spectrogram, the incredibly versatile SoX tool works well. Just run:
sox filename.wav -n spectrogram
This produces a spectrogram.png image.

Conclusion

I originally tried just doing a bash script with SoX calls, but found that it doesn't handle merging 100+ frequencies. After I found this page showing how to add a frequency in Perl, it was pretty simple. Overall it works as expected, but it is horribly slow on larger images with lots of color.



Last Edited: 2010-10-25 03:23:21



Hadayamaku said (2010-10-25 03:25:56):
Great!

Brent Fisher said (2010-10-25 03:25:33):
For those writing a program to generate audio whose spectrogram is an arbitrary
image, I should tell you that in order for the resulting sound wave to sound
pleasant to the ear, you need to set the starting phase of each frequency to a
random degree/radian value. Then leave the phase alone. I tried a similar thing
in C programming using an FFT and was getting "click-click-click" sounds which
sounded awful. The phase needs to be randomized at the start of the signal. Once
I figured that one out it sounded as good as the commercial image to
spectral-audio programs.

Gavin Black said (2010-10-25 03:25:01):
DM DOKURO, you need to install the module (Audio::Wav) from cpan
(http://search.cpan.org/dist/Audio-Wav/Wav.pm) if you haven't already

If you're on Linux/Unix/Cygwin/OSX, try (as root):
cpan -i Audio::Wav

You will also need GD (cpan -i GD), these are the only 2 dependencies.

You can e-mail me if you need more help. It's listed in the about section (Not
putting it in the comments to prevent spam to the address).

DM DOKURO said (2010-10-25 03:24:38):
Err... It says that there's an error near Line 3 (Audio::wav). What does this
usually mean?

Gavin Black said (2010-10-25 03:24:11):
Excellent suggestion! I've updated the post with a new picture.

loudness said (2010-10-25 03:23:51):
I wonder if the last pic would benefit from a little more contrast: maybe spectrogram options something like -Z -60 -z 60 ?