>> I'm unsure if we will need the 180Mb databases...
No,... at least not in practice. But I DL-ed it anyway to have some reference.
Yet more scientific links: http://www.mat.ucsb.edu/240/C/

>> it seems that the dolby headphone technology isn't listener-specific
>> (at least, it is not so ear size-dependent)
Yeah, they use a 'single universal HRTF setting' (whatever that may be).
Hmm... I guess it's some average of common HRTFs.

In PowerDvD there is also a "Dolby virtualizer" (probably the same one you mentioned in WinDvD).
Well, it sounds like a normal room reverb with some phase inverting to me.
I wonder if it's the same technology as "Dolby Headphone": http://www.dolby.com/dolbyheadphone/
I hope not..

Thanks for the Headwize links. Good and useful reading stuff. Can't wait to start on this.
(duh... I wish I had an extra 'lifetime' to do all the things I want to do....)

/LeMury
>> Headwize
Yep, a very good source of information -- lots of articles.

----------

After browsing the Net (including the dolby.com site etc.) I finally created a preliminary specification for the "kX Surrounder+" effect: ))

input: 2 stereo channels + 7.1 content (10 ins in total)
output: up to 7.1 (8 outs)

internal routing:
  stereo -> [expansion: Surround/ProLogic, ProLogic II, ProLogic IIx?] -> mixer
  5.1 / 7.1 content -> [expansion: a-la DolbyDigital EX] -> mixer

The expansion is optional and can be turned on and off (that is, at least two switches: 'Decode Surround / Movie mode' and 'Decode 6.1/7.1 Surround').
The decoder is obviously applicable only when playing back stereo sources with encoded information (such as a stereo TV-in, an AVI file with a generic movie sound track, or an audio CD with a DolbySurround/ProLogic track).
The expansion is not related to AC-3 stuff at all.
The expanded signal is mixed with the '5.1/7.1 content', thus enabling further processing (e.g. dolby surround encoding).

The mixer has the following output options:

1. prologic II encoder (compatible with prologic 1 and dolby surround by design)
   --> two stereo channels - to be connected to external decoders, HiFi systems, or to be recorded
   (we might have problems with prologic ][ since its encoding/decoding algorithm is somewhat complicated and not fully described)

2. headphones: [hrtf]
   a-la dolby headphones, with 2-3 different room settings
   --> two stereo channels
   The reverberation algorithm might be shared with our Reverb effect (that is, probably an EAX3 reverb should be used).
   sub-settings: '5.1', 'stereo', 'monitoring' (the last one is used to simulate audio monitors)
   Sub-settings select the preferred HRTF algorithm and depend on the source signal type (movies/games or stereo tracks).

3. speakers: [virtual]
   - 2.0/2.1
   - 3.0/3.1
   - 4.0/4.1
   - 5.0/5.1
   - 6.0/6.1
   - 7.0/7.1
   For any non-5.1 setting, an HRTF algorithm similar to dolby virtual speaker is used.

speaker settings:
 - 'direct' (any sound data sent to non-existent channels is lost) -- no downmixing is performed
 - 'downmix' (simple mix: center -> left & right; rear -> front -- depending on the speaker config)
 - 'copy' (rear = front)
 - 'surround' (hrtf-based, with virtual speakers)

bass options: the LFE channel has 3 main options (the way it is currently implemented):
 - 'use physical LFE channel' (if inactive, the LFE stream is mixed with the Left & Right channels)
 - 'redirect bass' (if active, the Left & Right stream is filtered, and the extracted LFE information is mixed with the direct LFE stream)
 - separation frequency

----

The dolby prologic 2 decoder has (per specs) two options: 'music' and 'movie', and three controls: 'center width', 'panorama', 'dimension'.

----

We could probably investigate the ways Creative implemented similar effects for 10kx cards (no reverse-engineering, of course ).
kxctrl -mx <dll name> will do the trick.

----

Trademark / copyright issue: we won't use any registered trademarks, so, for our purposes, all prologic/surround stuff should be called 'Surround' ('Surround encoding' / 'Surround decoding').
We should also invent a good term for "virtual speaker" and "hrtf-based headphones".

/E
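[To make the 'downmix' and 'redirect bass' options above concrete, here is a minimal host-side C++ sketch -- illustration only, not 10kx microcode; the struct name, the -3 dB mix coefficients and the one-pole crossover are placeholder choices, not part of the spec:]

#include <cmath>

// Illustrative 5.1 -> 2.0 'downmix' plus 'redirect bass' (hypothetical names).
struct Downmixer {
    float rate  = 48000.0f;   // sample rate
    float cutHz = 80.0f;      // 'separation frequency'
    float lp    = 0.0f;       // one-pole low-pass state for bass extraction

    void process(float l, float r, float c, float lfe, float sl, float sr,
                 float& outL, float& outR, float& outLFE)
    {
        // 'downmix': center -> left & right, rears -> fronts (-3 dB each, placeholder)
        outL = l + 0.707f * c + 0.707f * sl;
        outR = r + 0.707f * c + 0.707f * sr;

        // 'redirect bass': filter L+R below the separation frequency and
        // mix the extracted bass with the direct LFE stream
        float a = 1.0f - std::exp(-2.0f * 3.14159265f * cutHz / rate);
        lp += a * (0.5f * (outL + outR) - lp);
        outLFE = lfe + lp;
    }
};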
[almost forgot]:

The code for the effect will be generated on-the-fly, depending on the options currently active.
I will implement a function similar to 'update_microcode' and will add it to the SDK (as well as enable this functionality for the kX Dane editor).
Probably, it would be nice to implement a dynamically generated C-like syntax for Dane, too (DynamicDane) -- something similar to the present prolog/epilog code, but much more programmer-friendly.

For instance, here's a [possible] C++ code:

void generate_surrounder(int pgm_id)
{
    KXDynamicDane d;

    d.begin();
    d.guid("2b8b7fa8-98b9-4f6e-81a0-400d3ba39c6f");
    d.name("Surrounder+");

    d.declare_input("left");
    d.declare_input("right");

    if(user_option&DECODE_SURROUND)
    {
        d.declare(temp,"tmp");
        d.MACS("left",.....);
        d.MACINTS(...);
    }
    ...
    if(user_option&ENCODE_SURROUND)
    {
        d.declare_static("delay");
        d.MACINTS(...);
    }
    ...
    d.update_microcode(pgm_id);
}

---------
/Eugene
Pffffeeeww,... You have been quite busy
I have to properly 'digest' all this first.., but it sounds great and OK! (especially the [almost forgot] part.. )

I have just started on a real "headphones virtualizer" a la Dolby/WinDvD/Creative. Looks promising so far.
To me this is vital, because I need some means to hear a decent Surround/5.1 sound representation over headphones for 3d development purposes.
Sure, I could buy a surround speaker set, but that won't work at night; besides that, I like the 'isolation' when programming, thinking and uhh... drinking )

I'll keep you posted.
/LeMury
Considerations on how to simulate 'Surround Speaker Sound' on headphones

Eugene, Max,

The direct use of HRTF data is, although the "best" approach, very hard to implement in general, and on the 10kx in particular. It would require convolution between the HRTF's impulse response (HRIR) data and the input data stream. I have no idea how this could be done on the 10kx.

An alternative way is to 'model' an HRTF rather than using a fixed HRTF data set. (Creative's headphone virtualizer also uses this approach, as far as I could tell from the code.)

After studying several docs here:
http://interface.cipic.ucdavis.edu/CIL_html/CIL_publications.htm
it appears that PRTFs (Pinna Related Transfer Functions) almost only give elevation cues and hardly any azimuth cues.

Correct me if I'm wrong, but when listening to, let's say, a 5.1 surround speaker set, one does NOT get different elevation cues from the speakers themselves. Only azimuth. I.e. if a fixed test signal is output in sequence (surround panned) to the speakers, one only gets different (changing) azimuth cues; the elevation remains the same (speaker height placement). If there are any elevation cues, they must already have been recorded in the audio signal. Please comment on this observation!

So, if this is true, then a lot of HRTF data can be ignored, namely the elevation data and the PRTF part. This could simplify the implementation drastically. Good results seem to be possible using only:
 - IID / ILD (Interaural Intensity/Level Difference) = sound pressure
 - ITD (Interaural Time Difference) = frequency-dependent delay
 - Head Shadow (causing the IID/ILD) = low-pass filtering

Of course this approach has its drawbacks. It could work fine in the horizontal front range, but it would be hard to make a good 'back' perception. I have made several frequency response plots from KEMAR HRTFs at 'back' azimuths. Hope I can derive some function from them.

Also: in order to get an 'out of head' perception, some form of room reverb must be applied. But this would also be the case with the direct HRTF approach.

Suggestions/comments are welcome,
/LeMury
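[To illustrate that simplified model, a minimal azimuth-only C++ sketch -- host-side illustration, not 10kx DSP code; the ITD/ILD/shadow constants are rough placeholders, not values taken from any HRTF measurement:]

#include <cmath>
#include <vector>
#include <algorithm>

// Minimal azimuth-only binaural model: ILD + ITD + head-shadow low-pass.
struct AzimuthSource {
    float az;                   // azimuth in radians, 0 = front, +pi/2 = hard right
    float rate;                 // sample rate
    std::vector<float> buf;     // small delay line for the far (delayed) ear
    size_t w = 0;
    float shadow = 0.0f;        // one-pole low-pass state for the far ear

    AzimuthSource(float azimuth, float sampleRate)
        : az(azimuth), rate(sampleRate),
          buf(size_t(0.0008f * sampleRate) + 2, 0.0f) {}   // ~0.8 ms max ITD

    void process(float in, float& outL, float& outR)
    {
        float s   = std::sin(az);                // > 0 : source on the right
        float itd = 0.0007f * std::fabs(s);      // crude Woodworth-style ITD, seconds
        size_t d  = std::min(buf.size() - 1, size_t(itd * rate));

        buf[w] = in;
        float far = buf[(w + buf.size() - d) % buf.size()];
        w = (w + 1) % buf.size();

        // head shadow: low-pass the far ear, cutoff drops as |azimuth| grows
        float cut = 8000.0f - 5000.0f * std::fabs(s);
        float a   = 1.0f - std::exp(-2.0f * 3.14159265f * cut / rate);
        shadow   += a * (far - shadow);

        // ILD: near ear at full level, far ear attenuated
        float farGain = 0.6f + 0.4f * (1.0f - std::fabs(s));

        if (s >= 0.0f) { outR = in; outL = farGain * shadow; }
        else           { outL = in; outR = farGain * shadow; }
    }
};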
>> An alternative way is to 'model' an HRTF rather than using a fixed HRTF data set.

[one possible approach]:

Imagine we have 5 virtual speakers. Each position is given, including elevation. So, we have 5 3-D sources, with azimuths, distances and so on.
Now we simply process the 5 incoming streams with 5 parametric equalizers (each equalizer has its own parameters that are related to a =constant= speaker position).
This won't emulate real 3-D, but will at least emulate a '5.1 speaker set'.
Of course, the parameters for the equalizers can be simplified (thus giving curves that are close to the given HRTF data).

Note that the effect should have 8 inputs (corresponding to all the 8 directions: n, nw, w, sw, s, se, e, ne).
This is also true for the 'Surrounder+' effect itself (+ one additional 'LFE' input and two 'stereo' inputs ==> 11 inputs ).

[some thoughts]:

The above scheme doesn't deal with 3-D source 'elevations' (only with speaker elevations). This is not 100% correct.
The elevation is 'emulated' by Creative's drivers by using a separate input with a custom filter applied. That is, if the source is elevated, its 3-D position components (5 or 8) are decreased, while the 9th component is increased. This seems to be a good idea.
So, the final 'Surrounder+' might have 12 inputs.

-=-=-=

Note that the elevation and spatialization are calculated by the driver, not by the DSP code. The DSP code receives a set of [already distributed in space] waves that correspond to an 'ideal 8-channel speaker set'.

-=-=-=-

Yep, I'm sure this will be sufficient.
So, we have 8 different [pre-calculated!] parameter sets, one for each 'virtual' speaker [and these can be configured via a custom user-friendly application ].
These parameters include:
 * level / attenuation
 * left/right delay
 * filter curve (it depends on the azimuth and elevation angles)
Of course, I doubt we will need 8 different filters -- we will need to optimize a little.

-=-=-=-

"Room reverb" - yes, of course. However, at the moment I suggest we don't include this part of the code in the Surrounder+ effect.

/E
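[As a rough illustration of the 'constant parameters per virtual speaker' idea, a small C++ sketch -- host-side only; the struct and the single biquad per speaker are a simplification, the real curves would come from the pre-calculated parameters mentioned above:]

// One fixed virtual speaker: level, per-ear delay, and a single biquad whose
// coefficients were pre-computed for that speaker's constant azimuth/elevation.
struct VirtualSpeaker {
    float gainL, gainR;            // per-ear attenuation
    int   delayL, delayR;          // per-ear delay in samples (indexes a small delay line)
    float b0, b1, b2, a1, a2;      // pre-computed 'filter curve' (biquad coefficients)
    float z1 = 0.0f, z2 = 0.0f;    // filter state (transposed direct form II)

    float filter(float x)
    {
        float y = b0 * x + z1;
        z1 = b1 * x - a1 * y + z2;
        z2 = b2 * x - a2 * y;
        return y;
    }
};

// Each of the 8 direction inputs would get its own VirtualSpeaker instance with
// constant parameters; the filtered, attenuated and delayed copies are summed
// into the two headphone outputs.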
further reading

Yes, I know it is not the main task at the moment. However, here's a good link:
http://www-ccrma.stanford.edu/~jos/waveguide/FDN_Reverberation.html
The FDN-based reverberation is used by Creative Labs. Moreover, they have some patents assigned (e.g. us6188769, and the 'inventor' is said to be Jot! -- can be downloaded from our site).

There's also some info about spatialized reverberation, as well as lots of pages dealing with different filters and DSP:
http://www-ccrma.stanford.edu/~jos/waveguide/ and http://www-ccrma.stanford.edu/~jos/filters/Index.html
Moreover, the site has an open-source audio processing library written in C (http://www-ccrma.stanford.edu/software/stk/).

/E
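[For orientation only, a tiny generic FDN in plain C++ -- a textbook-style sketch with arbitrary delay lengths and feedback gain, not Creative's or Jot's actual implementation:]

#include <vector>
#include <cstddef>

// Minimal 4-line feedback delay network: four delay lines whose outputs are
// fed back through an orthogonal (here Hadamard) matrix with gain < 1.
struct TinyFDN {
    std::vector<float> d[4];
    size_t w[4] = {0, 0, 0, 0};
    float g = 0.8f;                                   // feedback gain, controls decay

    TinyFDN() {
        const size_t len[4] = {1031, 1327, 1523, 1871};   // mutually prime lengths
        for (int i = 0; i < 4; ++i) d[i].assign(len[i], 0.0f);
    }

    float process(float in) {
        float o[4];
        for (int i = 0; i < 4; ++i) o[i] = d[i][w[i]];

        // 4x4 Hadamard feedback, scaled by 0.5 to keep it orthonormal
        float f[4] = {
            0.5f * ( o[0] + o[1] + o[2] + o[3]),
            0.5f * ( o[0] - o[1] + o[2] - o[3]),
            0.5f * ( o[0] + o[1] - o[2] - o[3]),
            0.5f * ( o[0] - o[1] - o[2] + o[3])
        };
        for (int i = 0; i < 4; ++i) {
            d[i][w[i]] = in + g * f[i];
            w[i] = (w[i] + 1) % d[i].size();
        }
        return 0.25f * (o[0] + o[1] + o[2] + o[3]);
    }
};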
and one more thing: please review the following topic:
http://www.driverheaven.net/showthread.php?s=&threadid=26487
/E
1. Great,.. more reading stuff.... No,.. j/k. Great info + SDK.
2. I'll buy a second Audigy tomorrow for the PC I'm programming on (don't wanna run up & down from my working room to my studio all the time), then I will check out your '3dsetup' and get back to you.
/LeMury
btw, the headphone effect might be used with the Live! too -- but it uses =too= many resources...
and please check the two 'unused' inputs of the HpSp effect -- I wonder what they might be used for...
/E
>> btw, the headphone effect might be used with the Live! too -- but it uses =too= many resources...
Yep, I know, I tested all versions of CL's HpSp (they are really big). Best to test them with the Audigy.

>> and please check the two 'unused' inputs of the HpSp effect -- I wonder what they might be used for...
Yep, I will, .. I had made frequency response plots of all HpSp inputs a while ago. (They matched closely with the HRTF database plots I made in MATLAB.) I will look them up again as soon as I've installed the ADGII (ADG1 was not available anymore....)
/LeMury
Should have read this thread earlier, as I've got something to add to the HRTF headphone simulation of 5.1 sound.

The direct use of HRTF data isn't that problematic. With some changes, the code of my 5.1-to-headphone plugin could be reconfigured to use only static variables and not fixed 'pointers' to the itram, thereby opening the option to pass different HRTFs to the effect. It wouldn't change much in the resources needed. It would still be a resource-eater, and ofc the quality of the effect would be better with any bit of calculation time given to it.

7.1 sound done the way the plugin is doing 5.1 now would take something like 140 gprs and up to 270 instructions, still leaving free resources even on a 10k1.

Ofc I'd be willing to do those modifications if you should decide to go the direct HRTF way instead of the 'filtered as if it went through an hrtf-convolution' one.
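[For reference, the direct approach boils down to an FIR convolution of each input channel with a left-ear and a right-ear HRIR. A plain C++ sketch follows -- host-side illustration only, nothing to do with the actual 10kx microcode; names and buffer handling are placeholders:]

#include <vector>
#include <algorithm>
#include <utility>
#include <cstddef>

// Direct HRIR convolution: one FIR pair (left ear / right ear) per channel.
// hrirL/hrirR would come from a measured set (e.g. KEMAR); lengths are arbitrary here.
struct HrirConvolver {
    std::vector<float> hrirL, hrirR;   // impulse responses for this channel's position
    std::vector<float> hist;           // input history ring buffer, newest sample at 'w'
    size_t w = 0;

    HrirConvolver(std::vector<float> l, std::vector<float> r)
        : hrirL(std::move(l)), hrirR(std::move(r)),
          hist(std::max<size_t>(1, std::max(hrirL.size(), hrirR.size())), 0.0f) {}

    // accL/accR are accumulated into, so the caller can sum several channels.
    void process(float in, float& accL, float& accR)
    {
        hist[w] = in;
        for (size_t k = 0; k < hist.size(); ++k) {
            float x = hist[(w + k) % hist.size()];      // x[n-k]
            if (k < hrirL.size()) accL += hrirL[k] * x;
            if (k < hrirR.size()) accR += hrirR[k] * x;
        }
        w = (w + hist.size() - 1) % hist.size();        // step the ring buffer backwards
    }
};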
>> ...thereby opening the option to pass different HRTFs to the effect.
Have you examined the option to dynamically apply microcode changes through a .dll extension to switch HRTF sets?

>> Ofc I'd be willing to do those modifications if you should decide to go the
>> direct HRTF way instead of the 'filtered as if it went through an hrtf-
>> convolution' one.
Any contribution, in any form, to the kX Project is always highly appreciated. So feel free to work out your own ideas/methods.
/LeMury
> Have you examined the option to dynamically apply microcode changes
> through a .dll extension to switch HRTF sets?

Not really. I thought it could be possible to pass the HRTF filter data using the xtram, although that would require more gprs and a lot more instructions (32 gprs and 100 instructions, maybe).
One part of the plugin would constantly update the HRTF filters from different places in the xtram, so changing the HRTF would take a few samples to fully come into effect. (It would use the skip command to access different parts of the HRTF updater, which would then change the xtram access registers for the next sample's calculation; by accessing all 32 registers of the xtram it would take 3 turns to change the HRTF -- on the other hand, it could be done in much smaller portions too.)

Still, dynamic HRTFs coded this way would increase the size of the plugin too much, especially considering the fact that people would only try around till they have their best HRTF and never change it again. So I fear the microcode changes would be the reasonable, but for me more complicated, way.
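[A conceptual C++ sketch of that 'refresh a slice of the filter per sample frame' idea -- this is not xtram/skip microcode, just an illustration; the chunk size and names are made up:]

#include <vector>
#include <cstddef>

// The active HRTF coefficient set is refreshed a chunk at a time, once per
// sample frame, so a new HRTF becomes fully active only after several frames.
struct ChunkedHrtfUpdater {
    std::vector<float> active;            // coefficients the filter currently uses
    std::vector<float> pending;           // new HRTF waiting to be copied in
    size_t cursor = 0;
    static const size_t kChunkSize = 32;  // e.g. 32 "registers" per frame

    void tick()                           // call once per sample frame
    {
        if (active.empty() || pending.size() != active.size())
            return;
        size_t n = active.size() - cursor;
        if (n > kChunkSize) n = kChunkSize;              // copy at most one chunk
        for (size_t i = 0; i < n; ++i)
            active[cursor + i] = pending[cursor + i];
        cursor = (cursor + n) % active.size();           // wraps after a few frames
    }
};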
Yes, that would be quite a waste of DSP resources.

Please have a look at the SDK ...\kxfxlib, and the demo. You'll find plenty of examples there. The da_xxx.cpp files contain the Dane microcode part exported to C++.
You'll need MSVC 6.0 to successfully compile a .kxl plugin. IIRC you did mention writing in C/C++ before, so it shouldn't be too hard, really.
/LeMury
It's not the code writing I've got trouble with. I'm gonna have to get MSVC++ 6.0; I usually use the Windows port of gcc (wxWindows for the GUI) and, for the simple C stuff, lcc-win32. Most of the stuff I coded was for hobby purposes, and the GUI wasn't something I really cared about. There was never a reason for me to get a commercial C++ compiler. Looks like I'll have to change my mind on that.