Tel: 905–690–4709 - Darryl Kirkland, Publisher


If accurate oral reproduction isn’t a major priority for your sound system, the message might be blowing in the wind.

There has been much emphasis on the capabilities of recently designed and installed sound systems in regard to their impact on music programs in churches. There’s little doubt that these new systems outperform the systems they replaced with respect to bandwidth, fidelity and power. But are those even the most important requirements? As new systems are being touted, it occurs to me that no one is discussing these systems in terms of speech intelligibility. This might be a good time to take a look at the issues that contribute to good speech intelligibility and the effectiveness of communications in general.

Now, before you turn the page and go on to the next advertisement for that shiny new piece of gear with the blue LEDs on it, hear me out. It is very important that the functional requirements within the church be examined. We must explore the technical options that satisfy the real needs of that environment. What we’ll discover in this process is that the needs for music systems and speech systems are different, and actually divergent in many ways.

Let’s look at some of the requirements for a high impact music system. Music systems will vary depending on the type of music or worship style they are to support. Obviously a conservative liturgical worship style will have a different need than a contemporary high impact progressive worship style. We will primarily address the high impact musical style in the scope of this discussion.

The needed sound output levels for different styles of worship will be quite different. The output required for liturgical and high impact worship often differs by better than a factor of 10 in acoustic power; a difference of 10dB to 15dB in output capability is not at all out of the question. The musical programming and content will determine the SPL capability of the system that is to support it.
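As a rough illustration of how the decibel figures relate to the "factor of 10", here is a minimal sketch that converts between decibels and power ratios. It assumes the factor refers to acoustic power, which is an interpretation rather than something the figures above state outright; the function names are made up for this example.

```python
import math

def db_to_power_ratio(db):
    """Convert a level difference in dB to a power ratio (assumes power, not pressure)."""
    return 10 ** (db / 10)

def power_ratio_to_db(ratio):
    """Convert a power ratio to a level difference in dB."""
    return 10 * math.log10(ratio)

if __name__ == "__main__":
    # The 10dB to 15dB range mentioned for liturgical vs. high impact systems.
    for db in (10, 15):
        print(f"{db} dB corresponds to about {db_to_power_ratio(db):.1f} times the power")
```

Under that assumption, the upper end of the range calls for considerably more than ten times the power, which is why the system has to be sized for the style it will support.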

I will define fidelity as the absence of distortion, extraneous noise, parasitic oscillations, resonances and frequency variations; basically the overall cleanness and reality of the sound. Regardless of style it should sound clean and natural.

High Impact music has a need for wide frequency bandwidth. Extended high frequency response will often be augmented by dedicated sub low frequency elements on the other end of the spectrum.

All sound systems have a duty to cover the entirety of the listeners’ areas with even and uniform sound coverage. We MUST cover the people space in the room (Rule-1). In application it is desirable to keep the variance of sound levels to +/-2dB, a 4dB envelope, across the main part of the seating area, with only the edges falling off to -6dB. For high impact systems, this includes the upper frequency bands of 4kHz to 15kHz if the system is to maintain a sense of presence.
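The +/-2dB target can be checked against a set of seat measurements with very little arithmetic. The sketch below is one possible way to do it; the readings are invented for illustration, and taking the midpoint of the loudest and quietest seats as the reference level is an assumption, not a measurement standard.

```python
def coverage_report(levels_db, tolerance_db=2.0):
    """Summarize seat-to-seat level variation.

    levels_db: SPL readings (dB) taken across the main seating area.
    Returns (reference level, total envelope, True if inside +/- tolerance).
    """
    loudest = max(levels_db)
    quietest = min(levels_db)
    reference = (loudest + quietest) / 2   # assumed reference: midpoint of the extremes
    envelope = loudest - quietest          # total spread, e.g. 4 dB for +/-2 dB
    return reference, envelope, envelope <= 2 * tolerance_db

if __name__ == "__main__":
    # Hypothetical readings, not taken from any real room.
    readings = [96.0, 97.5, 95.0, 98.5, 96.5]
    reference, envelope, ok = coverage_report(readings)
    print(f"reference {reference:.2f} dB, envelope {envelope:.1f} dB, within target: {ok}")
```

The same check would be repeated per frequency band, since the goal above extends to the 4kHz to 15kHz range.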

Regardless of the musical style it is desirable for the music to have a sense of envelopment and width. This is independent of sound output level, fidelity or frequency response bandwidth. When you look at the platform and see some number of musicians, instruments, vocal ensembles, choirs and the musical leader, they are most likely spread out across the width of the platform. Your eyes, as programmed by nature, tell you this is the case, and your perceptual experience expects that the sound should also have a width that roughly matches what your eyes see.

When the sound system delivers a sound that is not consistent with the physical conditions there is a discontinuity that we pick up on, even if at an unconscious level. This is one of the reasons why central point source systems have fallen out of favor for high impact music oriented churches. The music sounds sterile and small when routed through them, regardless of bandwidth and fidelity. The sound originates from a point source and will have a point of focus that is inconsistent with our visual perception of the musical presentation. We want to hear the music with the sense of width that our more powerful visual sense has clued us into; we are more visual creatures than aural.

It is the lack of spatiality in music reproduction systems that spawned stereo in the first place. Stereo uses two loudspeakers spaced some distance apart and differences in sound levels of different instruments within the music to create a width to the sound, referred to as a soundstage, at the listener’s position.

When a signal is sent equally to the left and right loudspeaker, the apparent source of the sound (a phantom image) will be in the center in between the two loudspeakers. When a signal is sent exclusively to the left loudspeaker it will appear to be located at the left. When a signal is split unequally between the two loudspeakers it will appear to be located favoring the loudspeaker in which it is louder to some varying degree. In this way different instruments and voices can be spread out across the width of the loudspeakers creating a panorama of sound.
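The level-based panning described above can be sketched in a few lines. The text does not name a particular pan law, so the constant-power (sine/cosine) law used here is an assumption chosen because it is common on mixing consoles; the function name and the pan range of -1 to +1 are likewise made up for this example.

```python
import math

def constant_power_pan(pan):
    """Return (left_gain, right_gain) for a pan position.

    pan: -1.0 is hard left, 0.0 is center, +1.0 is hard right.
    Uses a constant-power law (an assumed choice), so the squared gains sum to 1.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1.0 and 1.0")
    angle = (pan + 1.0) * math.pi / 4.0    # maps pan to 0 .. pi/2
    return math.cos(angle), math.sin(angle)

if __name__ == "__main__":
    for pan in (-1.0, -0.5, 0.0, 0.5, 1.0):
        left, right = constant_power_pan(pan)
        print(f"pan {pan:+.1f}: left gain {left:.3f}, right gain {right:.3f}")
```

At center both gains are equal, which is the condition for the phantom image to sit midway between the loudspeakers; moving the pan control simply trades level from one side to the other.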

In order for the effect to work fully the listener must be relatively close to the center of the two loudspeakers, both in respect to level and arrival time. The level from each loudspeaker at the listener must be the same in order to be able to source the sound in the middle, and the sound from each loudspeaker must arrive at the listener at the same time. Differences in sound level between the loudspeakers as small as 2-3dB will cause a noticeable shift in the apparent location of the phantom sound. Offset times between left and right also influence the perceived location of the sound within the soundstage. Differences in arrival time between the L/R loudspeakers of 5-7ms can significantly shift the apparent location of the phantom image within the soundstage. The perceived location of the sound will shift toward the loudspeaker whose sound arrives first.

In Figure-1 we see a typical stereo speaker arrangement in a small room and two different listening locations; we will assume that the two loudspeakers have equal coverage at the two locations. At position POS-1 there is no level or time differential, because we are at the center of the two loudspeakers, and differentials in the program material will be perceived within the soundstage between the loudspeakers.

At POS-2 there is a difference in level of 3.6dB and in time of 4.1ms. The level of the Right speaker is 3.6dB louder, due to proximity, and the signal of the Left is delayed by 4.1ms, again due to proximity and the speed of sound. From this point of observation the listener will clearly locate a sound which was “panned” to center very near or at the Right loudspeaker. The listener has two aural cues that the sound should be located somewhere to the right side: the level is louder from the Right and the signal is later from the Left. This is a small room example; but what about a common church building?
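The kind of calculation behind these level and time differences can be sketched as follows. This is a simplified model, assuming omni-directional point sources, inverse-square level loss and a speed of sound of 343 m/s; the speaker and listener coordinates are hypothetical and are not taken from Figure-1.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature

def arrival_differences(listener, left, right):
    """Return (level_diff_db, time_diff_ms) for a listener and two loudspeakers.

    Positions are (x, y) tuples in meters. A positive level difference means the
    Right loudspeaker is louder; a positive time difference means the Left arrives later.
    Assumes omni-directional sources and inverse-square level loss.
    """
    d_left = math.dist(listener, left)
    d_right = math.dist(listener, right)
    level_diff_db = 20 * math.log10(d_left / d_right)
    time_diff_ms = (d_left - d_right) / SPEED_OF_SOUND_M_S * 1000
    return level_diff_db, time_diff_ms

if __name__ == "__main__":
    left_speaker, right_speaker = (-2.0, 0.0), (2.0, 0.0)   # hypothetical layout
    for name, seat in (("center seat", (0.0, 3.0)), ("off-center seat", (1.5, 3.0))):
        level, delay = arrival_differences(seat, left_speaker, right_speaker)
        print(f"{name}: Right louder by {level:.1f} dB, Left later by {delay:.1f} ms")
```

A seat on the centerline gives zero for both values, while any seat off to one side picks up both a level and a time offset favoring the nearer loudspeaker.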

In Figure-2 we have one of those Sanctinasium / Multiuse rooms that are so popular, but are not that different from a fan shaped room in respect to width vs. depth of the seating areas. At POS-4 there is no difference in the arrival time or level, as the point of observation is in the center. But at POS-5 we see a different situation. The Left signal will be 4.6dB lower than the Right and arrive 22.7 milliseconds late. At this seating location the listener will clearly be locating a center “panned” sound at the Right loudspeaker.

Figure-3 considers what is happening in the front quarter of the room. At POS-6 there is no difference, but at POS-7 things are VERY different. The Left signal is 10.1dB lower (perceived as roughly half as loud) and 36.44ms later relative to the Right loudspeaker. The listener will be pinned to the Right loudspeaker and clearly not have a point of reference from the Left by which to perceive a soundstage.
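As a sanity check on numbers like these, a level difference and a time difference together imply a pair of path lengths, provided the sources behave as omni-directional point sources with inverse-square loss. The sketch below works backward under that assumption and an assumed speed of sound of 343 m/s; the function name is invented for this example, and the results are only as good as those assumptions.

```python
import math

def implied_path_lengths(level_diff_db, time_diff_ms, speed_of_sound=343.0):
    """Return (near_m, far_m): distances to the nearer and farther loudspeaker.

    Assumes omni-directional sources, so the distance ratio is 10**(dB/20),
    and that the extra path length equals speed_of_sound * time difference.
    """
    if level_diff_db <= 0 or time_diff_ms <= 0:
        raise ValueError("both differences must be positive")
    ratio = 10 ** (level_diff_db / 20)              # far distance / near distance
    extra_path_m = speed_of_sound * time_diff_ms / 1000
    near_m = extra_path_m / (ratio - 1)
    return near_m, near_m * ratio

if __name__ == "__main__":
    # Level and time differences quoted for POS-2, POS-5 and POS-7.
    for name, db, ms in (("POS-2", 3.6, 4.1), ("POS-5", 4.6, 22.7), ("POS-7", 10.1, 36.44)):
        near, far = implied_path_lengths(db, ms)
        check_db = 20 * math.log10(far / near)      # round-trip check of the level difference
        print(f"{name}: near {near:.1f} m, far {far:.1f} m (level check {check_db:.1f} dB)")
```

For POS-2 this works out to roughly 2.7 m and 4.1 m, which is plausible for a small room; the larger rooms give correspondingly longer paths.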

Real loudspeaker coverage exacerbates this problem when we consider the loudspeaker’s horizontal coverage. Previously we have assumed an omni-directional source for our Left/Right loudspeakers. Real loudspeakers typically have a coverage pattern associated with them of some horizontal and vertical beamwidth. Outside of the specified beamwidth the sound level drops off increasingly with angle away from the zero degree, on axis, level of the loudspeaker.

In Figure-4, the same room is shown in EASE using a 94/50 degree pattern loudspeaker in the L/R positions. At POS-4 the level and time are the same as before; again, no surprise. At POS-5 the levels are 10.3dB different and the arrivals are offset by 21ms. While the time differential has changed very little from the Figure-2 scenario, the level difference has grown by roughly 5.7dB, from 4.6dB to 10.3dB. WHY?

Take a look at Figure-5, where the Right loudspeaker has been turned off and only the Left is on. The level in front of the loudspeaker is roughly 107dB and on the opposite side it is around 95dB; that’s a 12dB difference. When the individual loudspeaker coverage is examined, we see that each loudspeaker’s contribution to the room does not even come close to covering the whole width of the space.

A line array? If you put a 6 element tall, 90 degree wide line array in the same space, aimed horizontally straight ahead (at the same area as in Figure-5), the differential from the left to the right side is roughly 10dB. That is only 2dB better than a traditional 90 degree wide loudspeaker in regard to horizontal coverage.

Given the constraints of level and time differences between the Left and Right loudspeakers in a real auditorium can we really call it stereo? Well yes we can; that’s right!

It can be argued that the center 20 or so percent of the room will appreciate an actual stereo mix on the PA; as for the rest of the room it’s dual channel but not really a convincing stereo image.

For better than 3/5 of the room the listeners will be fairly anchored to the loudspeaker array that they are sitting in front of, and those in seats beyond the width of the loudspeaker placement will be completely attached to the adjacent loudspeaker. The folks in these areas will only get a strange mix, depending on how wide the operator is panning different sources in the mix.

A split source does provide a wider sense of sound that generally has some size to it. It is capable of providing a sound that is dimensional in nature and for the most part correlates better to what our eyes are seeing on stage with respect to musical program material. The areas where the L/R coverage overlaps will have comb filtering effects in the frequency response, but these are typically dense in nature and are not especially disconcerting to the listener.

In terms of intelligibility, with regard to singing and melodic content, the intelligibility in an L/R configuration can be quite acceptable. When the sound operator understands what panning actually does with respect to most people’s perception in the room, the sound from an L/R system is basically more consistent with the visual situation in a room when it comes to music.

The L/R configuration has significant negative issues when it comes to speech reinforcement. The offset times and levels cause the listener to be disconnected from the talker, and actual speech intelligibility can be affected. We have examined what some of the constraints are in an L/R system, as far as apparent location is concerned, and some of the drawbacks inherent in such a configuration.

Part of being connected to the talker is a visual thing; we see them in a particular place and all of our experience has conditioned us to expect the sound to come from that same place. When there is a disparity between the location of the talker and the apparent location from which we hear that individual, there will be a corresponding increase in listener stress. It is this listener stress that contributes to people becoming disinterested in, or distracted from, what the talker is saying. This disconnection truly diminishes the effectiveness of communication, especially considering the interpersonal nature of the type of message presented in a church setting.

Speech intelligibility is paramount in presentation of the spoken word. When there are multiple arrivals of a spoken sound that reach the listener more than 15ms to 20ms after the direct sound, they will interfere with the clarity of the direct sound of that speech program. This is true for reverberation and acoustic echoes, and also for echoes we introduce through our choice of sound system loudspeaker layout. Refer back to Figure-2 through Figure-5 and take note of the offsets in arrival time, around 20ms through 40ms, between the left and right loudspeakers. These late arrivals will degrade speech intelligibility in the areas of concern.
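To make the comparison concrete, the arrival offsets quoted for the figures can be sorted against the 15ms to 20ms window. The sketch below is only a bookkeeping aid; the three labels are invented for this example, and treating 15ms and 20ms as hard boundaries is a simplification of what is really a gradual perceptual effect.

```python
# Arrival-time offsets between Left and Right quoted earlier for each position.
FIGURE_OFFSETS_MS = {
    "POS-2 (small room)": 4.1,
    "POS-5 (Figure-2)": 22.7,
    "POS-7 (Figure-3)": 36.44,
    "POS-5 (Figure-4)": 21.0,
}

def classify_arrival(offset_ms, lower_ms=15.0, upper_ms=20.0):
    """Sort a late arrival into one of three illustrative categories.

    Below lower_ms the arrival is treated as integrating with the direct sound,
    between the limits as borderline, and above upper_ms as interfering.
    """
    if offset_ms < lower_ms:
        return "integrates"
    if offset_ms <= upper_ms:
        return "borderline"
    return "interferes"

if __name__ == "__main__":
    for position, offset in FIGURE_OFFSETS_MS.items():
        print(f"{position}: {offset} ms -> {classify_arrival(offset)}")
```

Only the small room example stays under the window; every offset quoted for the church-sized rooms lands beyond it.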

We cannot minimize the importance of effective speech communications in a house of worship. It can be strongly argued that all of the other activities in this environment are in a supporting role to the spoken word part of the typical service. The musical program prepares the hearts and minds of those in attendance to receive the spoken word, and as such it is also important not to minimize the importance of effective musical presentation. As examined above, it is clear that the type of sound support system that is optimal for music may not be, and likely is not, optimal or even prudent to use for the speech portion of a service.

Effective communication of the spoken word is absolutely crucial to the mission of the majority of houses of worship. We can measure actual speech intelligibility in a number of ways, under a number of standards, but there exist no metrics by which to measure the connectedness of the listener to the talker. There are psychoacoustic principles that determine this connection to the person delivering the message, but measurement of actual people would be difficult and intrusive. Just ask yourself how often, during the spoken part of the service, your mind has wandered off to other subjects: lunch, work, a sports game or so many other possibilities. It may not be the subject matter or delivery; possibly the loudspeaker system is inappropriate for speech reinforcement and is contributing to listener fatigue. Effective interpersonal communication is a matter of intelligibility, naturalness, and connection to the person delivering the message.
