Adding Sound to a React Game
2022. 1. 27.

Recently, I've been spending my weekends creating a simple game for my child using React. My child was so engrossed in a game called "Infinite Stairs" that I decided to create a version that retains the core fun(?) while removing potentially harmful addictive elements. It's a simple game where you climb stairs left and right, earning points to buy avatars and customize your character. However, I removed the feature of buying in-game currency with points. The title is "Finite Stairs". I was having fun meticulously creating the stairs and characters, but an unexpected problem arose.
This game plays sound effects when climbing stairs and when failing. This is where a cross-browser issue occurred[4]. Honestly, if I only used Chrome, it wouldn't have been a problem, but curiosity led me to dig a little deeper. That's the beauty of side projects: you can do things even if they aren't strictly necessary, just because you want to, and avoid things you don't want to do, even if they are needed. :)
In this post, I'll introduce the problems encountered while using sound effects in a web game built with React and how to use the "Web Audio API" easily and safely to handle sound effects.
What was the problem?
Even though it's a simple game made with React and web technologies, all the essential game components had to be implemented, including sound output. The first thought was to use the <audio> element to play sounds[6]. Yes, I thought so too, and simply used <audio> to preload the two necessary sounds and played them with the play() method whenever needed. There were no issues in Chrome during development, but Safari caused trouble[4][7].
The first time, the sound played without problems just by calling the play() function. But from the second time onwards, play() didn't work. This issue occurred not only in Safari on macOS but also in Safari on iOS[4]. That was a major problem, since the device I intended my child to play on was an iPad.
Honestly, I'm not sure which behavior aligns with the specification; I didn't investigate that far[4]. It could be "implementation-defined" (up to the browser developers) or just a bug. I suspect it's a bug, but seeing it unresolved for a long time makes me unsure. Perhaps this issue isn't confined to games. You might want to add sound effects to buttons on a website, for example. It seems like a problem that could occur in various places.
Various Attempts
When I first discovered this problem in Safari, I thought of several solutions. The easiest was to manipulate the currentTime property of the HTMLMediaElement interface. currentTime indicates the current playback position in seconds. A value of 0 is the starting point, it updates in real time during playback, and it equals the sound's duration when playback completes. According to the spec, it's readable and writable.
function playSound(sound) {
  sound.currentTime = 0;
  sound.play();
}
This isn't the exact code I used, but a simplified version for clarity. I hypothesized that the sound wasn't replaying because currentTime wasn't being reset to the start, so I explicitly set currentTime to 0. Doesn't it seem like this should work? But it didn't. Checking currentTime in the console showed 0, but playback didn't occur correctly.
After various struggles, I found a strange workaround: forcibly calling load()[7].
function playSound(sound) {
  sound.load();
  sound.play();
}
Calling load() every time the sound needs to play forces the file to be reloaded each time, which lets the sound play normally. Speculating on the cause, there might be a synchronization issue between the sound's metadata and the JavaScript object modeling it, and reloading re-initializes it every time. But it feels messy: the developer tools show a continuous stream of network requests. Caching might cover it, but it doesn't seem like the right answer. It feels like introducing another potential bug to avoid the current one.
[Image: repeated network requests shown in the browser's developer tools]
From what I could find, there were only speculations and no official explanations. Anyway, it was a very old problem[4]. I decided to just remember that Safari has a bug with currentTime.
While the single-play issue was resolved (sort of) by calling load(), sound latency was also a problem. When the jump key was pressed, there was a slight delay before the sound played. Even for a kids' game, the delay was unacceptable to me; it spoils the fun. So I decided to use the Web Audio API, something I hadn't intended to use for such a small project. It's actually not that complicated. :)
The Problematic Code
First, let's look at the full source code of the version that calls load() every time.
import { useEffect, useState } from 'react';

const audios: Map<string, () => void> = new Map();

function useAudio(data: Record<string, string>): { loaded: boolean; audios: typeof audios } {
  const [loaded, setLoaded] = useState(false);

  useEffect(() => {
    const promises: Promise<unknown>[] = [];
    audios.clear();

    Object.keys(data).forEach((key) => {
      const sourceUrl = data[key];

      promises.push(
        new Promise((resolve) => {
          const sound = new Audio(sourceUrl);
          sound.load(); // Kick off loading so canplaythrough fires
          // Resolve once the browser has buffered enough to play through.
          sound.addEventListener('canplaythrough', resolve, { once: true });

          audios.set(key, () => { // [1]
            sound.load(); // Workaround for Safari's repeat-play issue
            void sound.play();
          });
        })
      );
    });

    Promise.all(promises).then(() => { // [2]
      setLoaded(true);
    });
  }, [data]);

  return {
    loaded,
    audios,
  };
}
I created a custom hook that loads the sounds and makes them playable. It takes an object data with sound names as keys and file paths as values, and returns a loading status flag (loaded) and a map (audios) containing functions that play the sounds.
const { loaded, audios } = useAudio({ jump: '/sound/jump.wav', fail: '/sound/fail.wav' });
The user can find the necessary sound's closure function ([1]) in the map and call it to play the sound.
audios.get('jump')(); // Plays the jump sound
Excluding the sound output delay, this code works correctly.
The component using this hook uses the loaded flag to show a loading bar or related UI, and starts the game once loading is complete. The loaded flag is set simply by Promise resolution ([2]).
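As a rough illustration, a component consuming the hook might look something like this. GameScreen, LoadingBar, and Game are hypothetical names for this sketch, not the actual components from my game.

function GameScreen() {
  // 'jump' and 'fail' are the two sound effects the game needs.
  const { loaded, audios } = useAudio({ jump: '/sound/jump.wav', fail: '/sound/fail.wav' });

  if (!loaded) {
    // Show a loading indicator until every sound is ready.
    return <LoadingBar />;
  }

  // Once loading completes, the game calls the stored closures to play sounds.
  return <Game onJump={() => audios.get('jump')?.()} onFail={() => audios.get('fail')?.()} />;
}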
How Was It Solved?
Searching on MDN, the introduction to the Web Audio API comes up first[2].
"The Web Audio API provides a powerful and versatile system for controlling audio on the Web, allowing developers to choose audio sources, add effects to audio, create audio visualizations, apply spatial effects (such as panning) and much more."
It's an API for producing powerful, versatile audio on the web[2][3]. You can add effects like reverb or delay to audio sources. You can access the data of playing music or sound to visualize fancy waveforms or other acoustic information[9]. It even allows low-level audio programming, so you can build electronic instruments[8] or implement effects like reverb yourself. You can take voice input from a microphone and apply real-time echo, karaoke-style. You can even build music-creation applications on the web[3][8]; plenty of commercial services actually do.
Web Audio is composed of combinations of audio nodes, forming what's called an audio graph[2][8]. I was interested in audio programming before and used Apple's AudioUnit framework; Web Audio and AudioUnit have very similar abstractions. There are independent functional modules related to audio processing, and these are used as nodes connected in a graph structure to perform a series of audio functions like outputting sound[8]. An image found on MDN illustrates the nodes and graph well, so I'm including it.
[Image: an audio graph of connected AudioNodes, from MDN]
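To get a feel for the graph idea in code, here's a minimal sketch (not part of the game) that routes a source node through a GainNode before it reaches the speakers:

// A tiny audio graph: source -> gain -> speakers.
const context = new AudioContext();
const source = context.createBufferSource(); // node that produces the audio
const gain = context.createGain();           // node that scales the volume
gain.gain.value = 0.5;                       // play at half volume

source.connect(gain);              // source feeds into the gain node
gain.connect(context.destination); // gain feeds into the speakers (output)
// Assign a decoded AudioBuffer to source.buffer and call source.start() to hear it.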
The Web Audio API has so many features that it enables low-level audio programming; entire books have been written about it[3]. What I need right now, though, is just to quickly play the desired sound without any special effects, so this will be no more than a small taste of the API.
Now, let's modify the code to output sound using Web Audio. It will be slightly longer than the previous code.
First, get an instance of AudioContext[3][8]. It's created once and reused. The AudioContext is the core of Web Audio, and an instance can be thought of as a unit of the audio processing graph[2][8]. It's like the document in the DOM: you use the AudioContext instance to configure the graph, create nodes, and so on[2].
const audioContext = new AudioContext();
Then, fetch the sound using the audio file path. The path is represented by the variable sourceUrl.
const res = await fetch(sourceUrl); // [1]
const arrayBuffer = await res.arrayBuffer(); // [2]
const audioBuffer = await audioContext.decodeAudioData(arrayBuffer); // [3]
Use fetch to get the data ([1]) and convert it to an ArrayBuffer to handle the binary data ([2])[3]. We won't directly manipulate the binary data here, but using AudioWorklet allows for low-level audio programming.
In the last line, audioContext.decodeAudioData() decodes the binary data into audio data[3][8]. Now audioBuffer is playable audio data.
To play the audioBuffer, create a buffer source node, a node in the audio graph capable of playing audio data, and pass the audioBuffer to it[3][8].
const trackSource = audioContext.createBufferSource(); // Create the source node...
trackSource.buffer = audioBuffer; // Pass the audio buffer.
Now, call the start() method to make trackSource output sound.
trackSource.start(); // ??
But there's no sound yet.
Just like creating HTML elements with JavaScript doesn't make them appear on screen unless they're attached to the body, trackSource needs to be connected to something that can output its sound[2][8]. That is, connect the sound-generating node trackSource to the node abstracting the speakers[8][9].
trackSource.connect(audioContext.destination);
You don't need to create the output node separately; just use audioContext.destination[2][8]. This outputs the sound to the device currently selected in the OS.
Now, let's go!
trackSource.start(); // Plays the sound
The useAudio hook modified to use Web Audio is below.
import { useEffect, useState } from 'react';

const audios: Map<string, () => void> = new Map();
// Cache decoded AudioBuffers so each file is fetched and decoded only once.
const audioBuffers: Map<string, AudioBuffer> = new Map();
// A single AudioContext instance shared by all sounds.
let audioContext: AudioContext | null = null;

function useAudio(data: Record<string, string>): { loaded: boolean; audios: Map<string, () => void> } {
  const [loaded, setLoaded] = useState(false);

  // Plays the decoded buffer for the given key on the shared AudioContext.
  const playSoundInternal = (key: string) => {
    if (!audioContext || !audioBuffers.has(key)) return;

    const trackSource = audioContext.createBufferSource();
    trackSource.buffer = audioBuffers.get(key)!;
    trackSource.connect(audioContext.destination);
    trackSource.start();
  };

  // Returns the closure stored in the audios map for each sound.
  const createPlayFunction = (key: string) => () => {
    if (!audioContext || !audioBuffers.has(key)) return;

    // Browsers suspend the context until a user gesture; resume it if needed.
    if (audioContext.state === 'suspended') {
      void audioContext.resume().then(() => playSoundInternal(key));
    } else {
      playSoundInternal(key);
    }
  };

  useEffect(() => {
    // Create the AudioContext only once.
    if (!audioContext) {
      const AudioContextClass = window.AudioContext || (window as any).webkitAudioContext;
      if (!AudioContextClass) {
        console.error('Web Audio API is not supported in this browser');
        setLoaded(true); // No sounds will play, but don't block the game.
        return;
      }
      audioContext = new AudioContextClass();
    }

    const promises: Promise<void>[] = [];
    audios.clear();

    Object.keys(data).forEach((key) => {
      const sourceUrl = data[key];

      promises.push(
        (async () => {
          // Reuse the decoded buffer if this sound has already been loaded.
          if (!audioBuffers.has(key)) {
            const res = await fetch(sourceUrl);
            const arrayBuffer = await res.arrayBuffer();
            const audioBuffer = await audioContext!.decodeAudioData(arrayBuffer);
            audioBuffers.set(key, audioBuffer);
          }
          audios.set(key, createPlayFunction(key));
        })()
      );
    });

    Promise.all(promises)
      .then(() => setLoaded(true))
      .catch((error) => {
        console.error('Error loading one or more audio files:', error);
        setLoaded(true); // Sounds that failed to load simply won't play.
      });
  }, [data]);

  return {
    loaded,
    audios,
  };
}
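The calling side doesn't change at all. For example, in a key handler (handleJump is just an illustrative name here), the same closure lookup plays the sound; since it runs inside a user gesture, the suspended AudioContext can also be resumed at that point.

const handleJump = () => {
  audios.get('jump')?.(); // Plays the jump sound through Web Audio
};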
Doing this resolved all the issues cleanly. The latency improved significantly, providing a satisfying tactile feel.
Conclusion
Personally, I have a lot of interest in Web Audio, so it was fun to use it even in this simple way. Web Audio is great on its own, but it can also work with WebAssembly to do richer things with excellent performance; I think that's one of the great synergies WebAssembly offers. If you're interested in Web Audio, I highly recommend subscribing to Web Audio Weekly, where you can encounter all kinds of audio-related experiments on the web.