Is there an RGB like format for sound?


So I was thinking something meta to myself: "if there is RGB (Red, Green, Blue) for colored light, is there something similar for sound?" I don't actually know, and whenever I look it up I just get results like:

The Color of Sound - Pitch-to-Color Calculator - Flutopedia.com {https://www.flutopedia.com/sound_color.htm}

See Sound Waves Using Colored Light (RGB LED) : 10 Steps ... {https://www.instructables.com/id/See-Sound-Waves-Using-Colored-Light-RGB-LED/}


So is there something like NPS (Note, Pitch, Speed) for sound waves? If not, I propose this as an interesting idea for a music-like format for computers.

But what are your thoughts? Has this been done before?
Is this a good idea? Would this idea work?


Thanks for reading,
and have a wonderful day!

DISCUSS (15)

I will contribute another point to the discussion, to follow up on @ahferroin7's comment.

Consider that light generated from RGB through a monitor has a pretty basic quality about it. A 50-pixel square block of blue does not have a lot of variation or nuance. However, when we talk about reflected light, it starts to take on a lot of qualities of the surface off which it was reflected (examples: glossy paper, iridescent paint, shadow from curvature). Such nuances are not really possible to represent as just RGB. CSS effects come along later to introduce gradients, blur, etc., but they are still very limited compared to what we observe in the real world.

Likewise perhaps, if we explore the sound space most of the sounds take on qualities "reflected" off the instruments we use to make them. C on flute sounds very different from C on piano despite being identifiable as the same note. And the sound additionally takes on qualities from the variations between one piano and another.

So probably if we were to come up with an RGB-like formula for sound, it would sound a bit sterile/artificial compared to what we are accustomed to nowadays. It might sound something like this at base. That sample was played through an internal PC Speaker -- the thing on your motherboard that beeps on bootup error. I think it only has about 63 different pitches at base (before you get into PCM to try to eke out more), and they all sound pretty similar except for pitch.
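To make the "limited pitch set" point concrete: the classic PC speaker is driven by a programmable timer chip whose base clock is 1193182 Hz, and the only control you have is an integer divisor of that clock. A rough sketch (the clock value is real PC hardware; the function itself is just illustrative):

```python
# Rough sketch: the PC speaker is driven by the PIT timer, whose base
# clock is 1193182 Hz; the speaker frequency is base_clock / divisor,
# so only frequencies of that exact form are reachable.
PIT_CLOCK = 1193182  # Hz, the PIT's base oscillator

def nearest_speaker_freq(target_hz):
    """Return the closest frequency the PC speaker can actually produce."""
    divisor = max(1, min(65535, round(PIT_CLOCK / target_hz)))
    return PIT_CLOCK / divisor

# Concert A lands very close to, but not exactly on, 440 Hz:
print(nearest_speaker_freq(440.0))
```

That quantization (plus having only a square-wave output) is a big part of why everything played through the internal speaker sounds samey.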


I understand, but my counterargument is this:
imagine I had a variable (v) equal to NPS(20, 20, 20). That would represent a note, wouldn't it? By doing this, we could manually (with a bit of trial and error) recreate the sound of any note on any instrument, could we not?
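One hypothetical way to give a triple like NPS(20, 20, 20) a concrete meaning: treat "note" as a semitone index, "pitch" as an octave, and "speed" as a duration, then map it to a frequency with the standard equal-temperament formula. None of this is an existing standard; it's just one possible interpretation of the idea:

```python
import math

# Hypothetical interpretation of an NPS (Note, Pitch, Speed) triple:
# 'note' = semitone within the octave, 'pitch' = octave number,
# 'speed' = duration in seconds. Anchored so that (9, 4, _) is A440.
def nps_to_frequency(note, pitch, speed):
    """Map a hypothetical NPS triple to (frequency in Hz, duration in s)."""
    semitones_from_a4 = (note - 9) + 12 * (pitch - 4)
    freq_hz = 440.0 * 2 ** (semitones_from_a4 / 12)  # equal temperament
    return freq_hz, speed

print(nps_to_frequency(9, 4, 1))  # → (440.0, 1)
```

This gets you *a* pitch from three numbers, but, as discussed above, it says nothing about timbre, which is where the real difficulty lives.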


NPS could give you a sound, maybe adding another number could give you a selection of instruments. But generating any note with accurate tonal qualities of any given instrument is more similar to the problem graphics devs have with generating realistic 3D surfaces. Even with a lot of sophisticated methods, the current state of the art is still a relatively poor approximation of reality. As compared to recorded video of real life. Definitely can't get very close with just 3 or 4 integers.

You could do NPS + Instrument to generate some identifiable sound. But the fidelity would be quite limited. And it would be obviously artificial. But yes it could work!

But then to actually play music, you would also need a way to combine lots of those. It would be more similar to CSS animation of an RGB-filled shape.

Ok! Thanks for your insight. Maybe I'd have to rework my proposal along the lines of what you said, NPSI. But what about NPSRV (Note, Pitch, Speed, Range, Vibration)?
It would be similar to RGBA, but with, one, an extra param, and two, the vibration, since all an instrument really is is a range of vibrations from 0 to X, each corresponding to a different note and/or octave.
Again, thanks for the insight on what you think!

I definitely think it is possible with the caveats mentioned. It seems like a fun idea that could be worth pursuing.

Do you know any easy way to make a library using built-in audio libraries? To word it more simply: are there any programming languages with built-in functions or libraries that would let me create audio in a way that expresses what I was thinking? I would really like to make a library for this now!
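One concrete option, as a sketch: Python's standard library ships a `wave` module, so a (frequency, duration) pair can be turned into an actual audio file with no third-party dependencies. The function name and parameters here are made up for illustration:

```python
import math
import struct
import wave

# Minimal sketch: synthesize a mono 16-bit sine tone and write it as a
# WAV file using only the Python standard library.
def write_tone(path, freq_hz, duration_s, rate=44100, amplitude=0.5):
    """Write a mono 16-bit PCM sine tone to a WAV file."""
    n = int(rate * duration_s)
    frames = b"".join(
        struct.pack("<h", int(amplitude * 32767 *
                              math.sin(2 * math.pi * freq_hz * i / rate)))
        for i in range(n)
    )
    with wave.open(path, "wb") as f:
        f.setnchannels(1)    # mono
        f.setsampwidth(2)    # 16-bit samples
        f.setframerate(rate)
        f.writeframes(frames)

write_tone("a440.wav", 440.0, 0.5)  # half a second of concert A
```

From there, an NPS-style library would mostly be a mapping layer from your tuples onto calls like this.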


Conceptually, what you're talking about is either MIDI, or some form of music tracker. Both are rather old technologies.

MIDI is still used some, but mostly for hardware and not data storage. MIDI data is very space efficient compared to recorded audio, but it has issues with reproducibility. If you take the same MIDI file and play it back on 3 different sound cards, at least one of them is very likely to sound noticeably different from the others. Conceptually, this is similar to how displaying the same image on two different screens will often result in slight differences in color, but it's a lot more noticeable than that for most people because it affects more than just the 'color' of the sound.

Music tracker formats solve the issue of reproducibility. A tracker is essentially software for replaying a set of audio samples in a specific sequence. They originated during the microcomputer era and were very popular at the time because MIDI was even less consistent back then than it is now.

The downside is that you still have significant inherent quality issues compared to an analog recording. Many low-end trackers would use only one note sample per instrument and then pitch-shift it to get the other notes, which results in a sound that diverges more noticeably from the actual note on a real instrument the further you get from the original sample's pitch. Even the highest-end trackers, however, still sound noticeably different from recorded audio, because the notes don't blend into each other the way they do with real instruments. The last widespread use of trackers was actually speech synthesis (this is why old speech synthesizers sound so bad: they usually had one sample per phoneme and just strung them together in the correct order).
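The pitch-shift trick described above can be sketched in a few lines: resample one recorded note by a ratio, which changes pitch and playback length together (and is exactly why the result degrades the further you move from the original pitch). This is an illustrative toy, not any particular tracker's algorithm:

```python
# Crude nearest-neighbour resampling, as low-end trackers did it:
# ratio 2.0 -> one octave up, and the sample plays back half as long.
def pitch_shift(samples, ratio):
    """Resample a list of sample values by the given pitch ratio."""
    n = int(len(samples) / ratio)
    return [samples[min(int(i * ratio), len(samples) - 1)] for i in range(n)]

base_note = [0, 3, 1, -2, 0, 3, 1, -2]   # stand-in for a recorded sample
octave_up = pitch_shift(base_note, 2.0)  # every other sample, twice as fast
print(octave_up)  # → [0, 1, 0, 1]
```

Real trackers did the same thing with interpolation and looping, but the fundamental limitation (one source sample stretched across many pitches) is the same.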

Something else to consider while looking at this is why RGB and similar formats exist for video and images. Quite simply, there is no space efficient analog method of producing an exact arbitrary and continuously variable frequency of light, let alone one that may have multiple spectral components to produce a color that can't be expressed with a single wavelength. Because of this, we have to store individual components of the signal itself and reproduce it from that. RGB happens to be the standard for computers because it was one of the easiest options to work with in hardware (and still is).

In contrast though, sound does not have these issues. It's trivial to produce an (almost) continuously variable frequency audio output in a rather small amount of space, and you can equally easily produce complex composite audio waveforms that can't be easily expressed as a simple set of frequency and amplitude components. Because of this, it makes more sense to just store the audio waveform directly, particularly since compressing a single discrete signal is not especially hard (compared to compressing the multiple not entirely independent signals that constitute color data in an image).
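A small sketch of that last point: even a mildly composite waveform (a fundamental plus a couple of partials) no longer has a single (frequency, amplitude) description, so PCM formats simply store the summed samples. The frequencies and weights below are arbitrary illustrations:

```python
import math

# Sum several sine components into one waveform. The result can't be
# described by one frequency/amplitude pair, which is why formats like
# WAV just store the sample values themselves.
def composite_wave(freqs_amps, rate=8000, duration_s=0.01):
    """Return samples of a sum of sines given (frequency, amplitude) pairs."""
    n = int(rate * duration_s)
    return [
        sum(a * math.sin(2 * math.pi * f * i / rate) for f, a in freqs_amps)
        for i in range(n)
    ]

# A 440 Hz fundamental with quieter partials at 880 and 1320 Hz:
samples = composite_wave([(440, 1.0), (880, 0.5), (1320, 0.25)])
print(len(samples))  # → 80
```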


Very interesting question! My first thought was wave frequencies, like here. However, you could similarly classify light by wavelength -- e.g. blue is 435-500nm. And RGB is an abstraction on top of that, which is more similar to dye mixing.

So I am also curious to know if something like this exists for sound.


I know, isn't it cool that abstractions like this exist? Because if you think about it, these abstractions can become more helpful than we might first expect!


Was going to respond that that sounded a lot like MIDI... but unless you played 8-bit arcade games in actual arcades, or have been using computers long enough that you remember the old Turtle Beach sound cards, there's a good chance you don't have recollections of MIDI.


Ok, so what is MIDI exactly? Could you send a link or explain, please?

Technically, it's probably less a method for creating/producing sounds than managing sound-delivery? Which is to say it was more a transport/mixing protocol than a sound-origination method.

Reason we mention it was that it was frequently associated with early, low-power, digital sound systems that were most-popularly used to produce/create sounds. Components within a MIDI chain could use proprietary methods to produce sounds (often using descriptive means for doing so), but my recollection is that that was simply related rather than part-and-parcel of the overall spec.

It's been since the 90s that I've used a specifically MIDI-enabled system, so, my memories are very foggy at this point.


What is 'MIDI'? I know roughly what it is... but what is its background? And would it be a format similar to NPS (Note, Pitch, Speed), like the RGB scale?


MIDI is used for communication between synthesizers and other musical equipment. Within a track in a MIDI file you can store whether a specific note is on, its duration, and any modulation / effects. It just came to mind since it's really the bare minimum of sound data you can store.
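To show just how little data that is: a MIDI note-on event is three bytes, a status byte (0x90 OR'd with the channel), a key number, and a velocity. A minimal sketch of building one by hand (the helper function is made up; the byte layout is from the MIDI spec):

```python
# Build a raw 3-byte MIDI note-on message: status (0x90 | channel),
# key number (0-127), velocity (0-127).
def note_on(channel, key, velocity):
    """Return the raw bytes of a MIDI note-on channel voice message."""
    return bytes([0x90 | (channel & 0x0F), key & 0x7F, velocity & 0x7F])

msg = note_on(0, 60, 100)  # middle C, velocity 100, channel 0
print(msg.hex())  # → '903c64'
```

Compare that with the ~88 KB per second that uncompressed CD-quality PCM needs, and the space-versus-reproducibility tradeoff discussed above becomes very tangible.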



PDS OWNER CALIN (Calin Baenen)
I am a 13 (as of Oct 30 of 2019) yr/o developer (I have been developing mini-projects for 4 years now, since I was 9), who makes projects in languages like: Java, HTML, Python 3, JS, CSS, and C#.