
Re: [fluid-dev] improving musical timekeeping


From: address@hidden
Subject: Re: [fluid-dev] improving musical timekeeping
Date: Mon, 10 Feb 2020 01:51:09 +0100
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.4.2

On 2020.02.09. 20:35, Marcus Weseloh wrote:
> Hi,
>
> first of all: this is a very interesting topic, thank you for bringing it up!
My pleasure; and I thank you for the moral support.

> I do sympathise with your idea and I don't think it is a very unique use-case at all. I also use MuseScore, but mostly for rehearsing. So I let MuseScore play some voices of a piece while I play another voice on top. And I usually choose sounds with a very short attack phase in MuseScore, so that the attack phases don't mess with the timing. I have used the OnTime offset shifting of MuseScore 2 from time to time, but as MuseScore 3 has crippled the user-interface for that feature (you can only affect a single note at a time), it is now way too much hassle for me.
Digression: in this scenario (interactive playing on top of scored music), my idea is that the playback from the score can be performed "on time" via the features under discussion, while for the interactive playing - an area we did not touch on before - a compromise could be reached, again using the musical onset markers, by delaying each interactively played note just enough that every note is delayed by the same consistent amount. I have a strong hunch that with a short but consistent delay between keypress and note onset, the interactively playing musician's brain will be able to compensate by pressing keys slightly earlier, so that the interactive notes sound in time with the notes the computer performs from the score. The system would also know the musical onset moments of the interactively played notes, and could notate those moments instead of the moment of the MIDI keypress. With this arrangement, even interactively recording a new track would be synchronized to the score without kludges - provided the delay can be kept short enough to fall within the range the brain can comfortably compensate for. One option is to let the user choose the maximum delay, and have the system truncate the attack phases of interactively played notes down to that limit. It would be a tradeoff between completeness of the interactively played notes (e.g. for a performance - in effect reverting to the current behaviour of no attack phase compensation) on one end, and completely omitted attack phases but the best interactive playability (a "best effort" full attack phase compensation, for the case where we only learn about a note at the moment it should already sound) on the other.
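To make that tradeoff concrete, here is a minimal sketch in C of what the scheduling decision for an interactively played note could look like. All names, the struct layout and the fixed user-chosen delay are assumptions for illustration only, not FluidSynth code:

#include <stdio.h>

/* Sketch: consistent-delay scheduling for interactively played notes.
 * All names are hypothetical; times are measured in output audio frames. */

typedef struct {
    long attack_frames;   /* frames from sample start to musical onset marker */
} sample_info_t;

typedef struct {
    long start_frame;     /* frame at which to start the voice */
    long sample_offset;   /* how many frames of the attack to skip (truncate) */
} voice_schedule_t;

/* keypress_frame: frame at which the MIDI note-on arrived.
 * max_delay_frames: user-chosen, consistent delay between keypress and onset. */
static voice_schedule_t schedule_interactive(long keypress_frame,
                                             long max_delay_frames,
                                             const sample_info_t *s)
{
    voice_schedule_t v;
    long onset_frame = keypress_frame + max_delay_frames; /* same for every note */

    if (s->attack_frames <= max_delay_frames) {
        /* the whole attack fits inside the fixed delay: play it completely */
        v.start_frame   = onset_frame - s->attack_frames;
        v.sample_offset = 0;
    } else {
        /* attack longer than the allowed delay: truncate its beginning */
        v.start_frame   = keypress_frame;
        v.sample_offset = s->attack_frames - max_delay_frames;
    }
    return v;
}

int main(void)
{
    sample_info_t slow = { 22050 };                        /* 0.5 s at 44.1 kHz */
    voice_schedule_t v = schedule_interactive(100000, 4410 /* 100 ms */, &slow);
    printf("start at frame %ld, skip %ld attack frames\n",
           v.start_frame, v.sample_offset);
    return 0;
}

With max_delay_frames chosen large enough, no attack ever gets truncated (full completeness, at the cost of a longer fixed delay); with it chosen very small, attacks get heavily truncated but the keypress-to-sound delay stays minimal.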

> So I agree: it would be great to have a good solution to this problem.

Let me add in advance - instead of pointing it out at each point - that in my opinion this compensation logic is essentially already in place today, with the sample attack phase hardcoded to zero: the synthesizer already computes the destination moment - the output audio stream frame number - at which it has to start playing the sample from the sample's origin (the first frame, i.e. sample offset zero). What would change is merely aligning a new origin point of the sample (with a potentially non-zero offset) with that destination moment. (Granted, my knowledge of the standard is currently limited.)
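Put differently, the only arithmetic that would have to change is roughly the following toy illustration (hypothetical names, not actual FluidSynth internals); today's behaviour corresponds to an onset marker of zero:

#include <stdio.h>

/* Toy illustration with hypothetical names: schedule a voice so that its
 * musical onset marker, not its first frame, lands on the note-on moment.
 * Times are output audio frames. */
static long voice_start_frame(long note_on_frame, long onset_marker_frames)
{
    /* today's behaviour is equivalent to onset_marker_frames == 0:
     * the sample's first frame is aligned with the note-on moment */
    return note_on_frame - onset_marker_frames;
}

int main(void)
{
    /* a note due at frame 88200 with a (made-up) 4410-frame attack
     * would have to start playing at frame 83790 */
    printf("%ld\n", voice_start_frame(88200, 4410));
    return 0;
}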
> Let's assume that SoundFonts would be extended so that they contain information about the length of the attack phase and the position of the first "meaningful" sample point, i.e. the sample point that should be "on beat". Let's ignore the fact that there are MIDI messages that affect the sample offset. And let's also ignore that the choice of which sample is heard can also depend on oscillators.
This, however, can be - and indeed already is being - precalculated, so it is known (assuming that by "sample" in the sentence above you mean a time series of measured audio values, and not just one value of such a series; as far as I know, the English word "sample" can mean both in this context).
> And that for a single note-on, many different samples (with different attack phases) could be started simultaneously.

It would be a meaningful musical discussion to hash out why this needs to be done at all, and whether in those cases the sound resulting from the ensemble of samples played together ends up having a single musical onset moment, rather than the individual samples having their own. If one tries to reproduce the sound of a pre-existing musical instrument, the sound of that natural instrument, when played, has a musically meaningful onset - so by extension, the reproduction of that instrument's sound should conceptually have the same onset, even if for whatever reason the reproduction ends up being constructed from multiple "samples" played in parallel. The sound still represents the same instrument musically, and therefore ought to be considered the same musically.

(Whether all instrument types have such musically meaningful moments in their sounds - for the above, assume an instrument that does - and which moments there are altogether, is itself a meaningful musical discussion; maybe some instruments will not have any such phases, or some instruments will have some while other instruments have others. As a side note, this is roughly where I think ADSR originally came from: trying to capture some musically meaningful and perceptually pleasant/necessary phases of an instrument's sound.)


> Then the synth would have to start playing the sample *before* the beat, in order to play it on beat. In other words, the sampler would have to act on a note-on event before this note-on event is actually due. And in order to decide which sample to play and how much attack phase time needs to be compensated, it would have to examine all CC and other messages that could influence sample choice and sample offsets leading up to that note-on event. And assuming we don't want to put an artificial limit on how long an attack phase of a sample is allowed to be, that effectively means that the synth would need to know and analyse (or even simulate) the complete MIDI stream right until the last event before starting the playback.
Correct - for rendering pre-existing/non-interactive/non-realtime/score-based music. The notes would be known ahead of time, so everything could be precalculated/simulated. Although a hard limit on how long an attack phase is allowed to be would not be good, it could be reasonable to lessen the burden by allowing the software driving the synthesizer to specify an upper bound beyond which attack phases get truncated. The synthesizer could even suggest such a value to its client, e.g. by precalculating a reasonable maximum attack length over the entire set of samples the music will be played with. The ideal solution would be to hide it all: ask for the entire score in advance and calculate everything perfectly internally. But for cases where this is inconvenient, a lookahead window could be used, where the synthesizer is continuously fed events far ahead of the current audio output time, so that there is a reasonable expectation that (almost) all of those events will be known to the synthesizer early enough to be synthesized on time. If not, then it is best effort, and some attacks will end up getting truncated somewhat: if you whip up an instrument with a two-minute attack phase, you had better not send a note-on event before two minutes into the piece. Think of polyphony and note stealing - another area where users can expect degradations in output if they ask for unreasonable things.
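Here is a small sketch of that best-effort rule, assuming the synthesizer knows the frame it is currently rendering and each event's due frame (hypothetical names, not an existing API):

#include <stdio.h>

/* Best-effort rule under a lookahead window: how much of a sample's attack
 * can still be compensated, given how early the synthesizer learned about
 * the note. Times are output audio frames. */
static long compensable_attack(long now_frame, long due_frame,
                               long attack_frames)
{
    long lead = due_frame - now_frame;   /* how far ahead the event arrived */
    if (lead < 0)
        lead = 0;                        /* event arrived late: no compensation */
    return attack_frames < lead ? attack_frames : lead;
}

int main(void)
{
    /* 0.5 s attack (22050 frames at 44.1 kHz), note due 0.25 s from "now":
     * only half the attack fits, the rest gets truncated (best effort) */
    printf("%ld\n", compensable_attack(441000, 441000 + 11025, 22050));
    return 0;
}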
> That is obviously impossible for live playback, you say that yourself.
Well, on second thought, yes and no - see digression above.
> But it also doesn't fit how synthesizers like FluidSynth are used by MuseScore, Ardour and similar programs. Because as far as I know, those programs are MIDI sequencers themselves. In other words: *they* control which MIDI messages to send and - most importantly - when to send them. They don't pass a complete MIDI file to FluidSynth to be played, but rather send MIDI messages in bursts or as a continuous stream.
I agree, but I don't consider this a conceptual issue, merely an API/implementation issue. I don't see any conceptual reason why MuseScore couldn't send the entire score to the synthesizer in advance, or at least ahead of time with a reasonable look-ahead window. (And we can open up a discussion about making this more efficient by introducing a caching mechanism, so that when an edit is made, not all instruments/tracks/MIDI channels have to be transferred to - and parsed and precalculated by - the synthesizer anew.) Maybe it is possible to edit the score in MuseScore during playback (but does anyone really do that?). If it is, that could still be supported by a sequencer API that allows adding events with timestamps and returns an identifier for each event, so that the client can later change or remove individual events.
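To sketch the kind of sequencer API I have in mind - purely hypothetical, this is not FluidSynth's existing sequencer API - something along these lines would let an editor retime or remove a single event after a score edit instead of resending everything:

#include <stdio.h>

/* Hypothetical sequencer-API sketch: events are scheduled with absolute
 * timestamps, each add returns an id, and the client can later retime or
 * remove a single event after a score edit. */

#define MAX_EVENTS 1024

typedef struct {
    int used;
    unsigned long when;   /* absolute tick/frame of the musical onset */
    int chan, key, vel;
} sched_event_t;

static sched_event_t events[MAX_EVENTS];

static long seq_add_note(unsigned long when, int chan, int key, int vel)
{
    long id;
    for (id = 0; id < MAX_EVENTS; id++) {
        if (!events[id].used) {
            events[id].used = 1;
            events[id].when = when;
            events[id].chan = chan;
            events[id].key  = key;
            events[id].vel  = vel;
            return id;                      /* handle for later edits */
        }
    }
    return -1;                              /* out of slots */
}

static int seq_retime_event(long id, unsigned long new_when)
{
    if (id < 0 || id >= MAX_EVENTS || !events[id].used) return -1;
    events[id].when = new_when;
    return 0;
}

static int seq_remove_event(long id)
{
    if (id < 0 || id >= MAX_EVENTS || !events[id].used) return -1;
    events[id].used = 0;
    return 0;
}

int main(void)
{
    long id = seq_add_note(480, 0, 60, 100);   /* middle C scheduled at tick 480 */
    seq_retime_event(id, 960);                 /* user moved the note */
    seq_remove_event(id);                      /* user deleted it */
    printf("scheduled, retimed and removed event %ld\n", id);
    return 0;
}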
> And they have very good reasons to do it that way.
So you do consider this a conceptual issue? What would be those good reasons? I can't think of any for MuseScore - but I am not an expert user of MuseScore.

> So in my opinion, if we wanted to implement a system like you propose, it would have to be implemented in the MIDI sequencer. In other words: in MuseScore, Ardour and all the other programs that use MIDI events to control synthesizers (which also includes FluidSynth's internal sequencer used to play MIDI files).
In light of the above, do you still think this is the case? And if yes, which parts would end up in the client (MuseScore, Ardour, etc.), and which parts (if any) would end up in the synthesizer? Would the client query the synthesizer about the attack length of each and every note, just to then turn around and advance (bring earlier) the corresponding note-on event before sending it back to the synthesizer? Wouldn't that logic be better done by the synthesizer itself, then?
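For contrast, the client-driven alternative I am questioning would look roughly like this (again hypothetical; query_attack_frames() stands in for a per-note attack-length query that does not exist today, and is stubbed with a constant here):

#include <stdio.h>

/* Hypothetical stub: in a client-driven design the editor would have to ask
 * the synthesizer (or a database) for the attack length of whatever sample a
 * given channel/key/velocity maps to. No such query exists today. */
static long query_attack_frames(int chan, int key, int vel)
{
    (void)chan; (void)key; (void)vel;
    return 4410;                         /* pretend: 100 ms at 44.1 kHz */
}

/* The client would then advance each note-on by that amount before sending it. */
static long client_send_frame(long score_onset_frame, int chan, int key, int vel)
{
    return score_onset_frame - query_attack_frames(chan, key, vel);
}

int main(void)
{
    printf("send note-on at frame %ld\n", client_send_frame(88200, 0, 60, 100));
    return 0;
}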

> So maybe all that is needed is better sequencer support for shifting OnTime offsets for notes, tracks and scores.
I don't know what OnTime offsets are, but if you don't have information on the per-sample (or per-note/per-pitch) attack length of each note, how much do you shift by? Furthermore, if different pitches end up having different attack lengths (think resampled samples), there is no single correct value to shift by... so I'm probably missing some part of your logic?
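To illustrate why there is no single correct shift once resampling is involved (the 500 ms attack and the root key below are made-up numbers): the same sample played at higher pitches plays back faster, so its attack shrinks in wall-clock time.

#include <math.h>
#include <stdio.h>

/* Made-up example: one sample with a 500 ms attack at its root key, played at
 * the root, a fifth above and an octave above. Resampling by ratio r shortens
 * the attack to attack/r in real time. */
int main(void)
{
    const double attack_ms_at_root = 500.0;   /* hypothetical attack length */
    const int    root_key = 60;               /* sample's root pitch */
    const int    keys[]   = { 60, 67, 72 };
    int i;

    for (i = 0; i < 3; i++) {
        double ratio = pow(2.0, (keys[i] - root_key) / 12.0);
        printf("key %d: playback ratio %.3f -> attack %.1f ms\n",
               keys[i], ratio, attack_ms_at_root / ratio);
    }
    return 0;
}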
> MuseScore is definitely lacking in that regard, it needs better user-interfaces for selecting multiple notes and affecting their OnTime offset. Maybe even support for some database of popular soundfonts that lists the OnTime offset for each note of each sample. MuseScore could then read that database and adjust all notes in a track automatically, if the user decides that it would make musical sense.
This sidecar file to SoundFonts feels like quite a workaround-like approach to me. It could achieve the timing in the short term, yes (though maybe still not perfectly - see above). But I could not imagine it as a long-term solution: it would be specific to MuseScore; the sidecar files could get separated from the SoundFonts (misplaced, renamed, mixed up, not found - "oh, I have downloaded this soundfont, but cannot find the sidecar file and now my music is all badly timed, does anyone know where to download it from?"); and how do you know what value to write into the sidecar file unless you open the font in a soundfont editor (or you are the soundfont author)? At that point it becomes much easier to mark the moment in the soundfont editor and save it with the soundfont than to write it manually into an external file, taking care not to make a mistake (different note, different velocity layer, etc.).

> Cheers
> Marcus
- HuBandiT


