Bugzilla – Bug 421
Crash importing malformed MP3 file using libmad
Last modified: 2020-05-25 05:45:02 UTC
The same issue occurs if libmad takes video files having mpg/mpeg extension but containing PCM audio (as they quite often do nowadays).
Fixed via patch from David Wallace in r 11229- see audacity devel ML for more info including tests.
@David and Michael, thanks for the work on this. I think we have pushed this down to borderline P3/possibly P4 now, but leaving at P3 ATM. Tested on Windows 7 x64 and OS X 10.7.4 on an initialised .cfg, with FFmpeg both available and not available. I can drag, File > Open or File > Import a WAV file misnamed as MP3 and it uses libsndfile to import, even if I choose "MP3 files" in File > Import Audio. On Ubuntu 12.04, HEAD built with system libsndfile as per default ./configure tries libsndfile but uses libmad. I assume this is due to http://code.google.com/p/audacity/source/detail?r=11458. Please confirm my assumption. However even with long stereo files, the result of libmad import is a few seconds of garbage, so not unsafe. However I still get a repeatable crash on Windows 7 x64 if I set an Extended Import rule for *.mp3 to use "MP3 files" and uncheck "Attempt to use filter in OpenFile..." - libmad takes the import. In comment 0, I said "The same issue occurs if libmad takes video files having mpg/mpeg extension but containing PCM audio". This used to happen when FFmpeg was not installed, however this now seems to fail safely (at least on Windows) with a request for FFmpeg if FFmpeg is not installed. As per bug 422, if I rename such a video file to MP3, libmad takes the file and there is a crash or hang.
I find this bug somewhat implausible and unnecessary. What kind of person changes the extension of a WAV file to be MP3 - and expects things to unconditionally work? More importantly On Windows what kind of person ignores the warning "If you change a filename extension, the file might become unusable. Are you sure you want to change it - Yes/No" On Mac what sort of person ignores the warning: "Are you sure yo want to change the extension from .wav to .mp3? If you make this change, your document may open in a different application. - Keep .wav / Use .mp3" Anyways I went ahead and followed the steps using the latest alphas: 1) on W10 with a ten minute non-wav it imported as a much shorter playable noise 2) on W10 with a 4 four hour non-wav Audacity hangs 3) on Mac with a 4 four hour non-wav Audacity crashes
*** STEPS UPDATED *** This potentially has security implications, as malformed files should not crash Audacity, but do. The libraries we use may need work to be more robust to malformed files.
(In reply to James Crook from comment #4) Perhaps use libmediainfo to identify file types before passing to import libraries so that we only pass file types that the importer can handle. https://mediaarea.net/en/MediaInfo
Upgrading to P2 because of the crash. This is somewhat of a fringe case, but a crash nonetheless - Audacity needs to handle such things "gracefully" - no crash and an informative error message
(In reply to Steve Daulton from comment #5) > (In reply to James Crook from comment #4) > Perhaps use libmediainfo to identify file types before passing to import > libraries so that we only pass file types that the importer can handle. > > https://mediaarea.net/en/MediaInfo This would certainly work and we'd never have to guess about file formats again as they seem to have them ALL covered. But, has anyone considered simply moving to the open source compliant versions of FFmpeg/libav and get rid of all our independent imports...just have one that does everything?
Another (less drastic) possibility would be to jettison libmad altogether and use libmp3lame (already in lib-src) to do the decoding. It seems to be able to detect "bogus" mp3 files just fine. In addition, we could use it for mp2 files as well and get rid of twolame.
As RM I don't intend to block on this for 2.4.0. It can stay as P2. I see the real issue as being robust to malicious files, not just detecting what type the supposed mp3 really is. I like the idea of using libmp3lame, but think it's risky for 2.4.0. FFmpeg would need too much investigation for 2.4.0, and if done should have been done early in development cycle, so not for 2.4.0.
While I think converting over to libmp3lame for MP3 decoding is a good idea, I think we could make libmad a bit safer in the interim. You see, it DOES actually report errors when it gets a bogus input stream, but we just ignore them. Does anyone know why???
(In reply to Leland Lucius from comment #10) >I think we could make libmad a bit safer in the interim. Safer and more "graceful" failure would be good in the interim.
"Does anyone know why???" Because we were lazy?
(In reply to James Crook from comment #12) > "Does anyone know why???" > > Because we were lazy? :-) Could be. Back in 2001, Dominic had added the callback for error handling, but left it all commented out.
I'll need to leave it out of 2.4.0 since it will require new strings for the error dialog. Libmad does provide his own error reasons that will not be translated, but the rest of the dialog will need to be. The libmad error strings are: case MAD_ERROR_NONE: return "no error"; case MAD_ERROR_BUFLEN: return "input buffer too small (or EOF)"; case MAD_ERROR_BUFPTR: return "invalid (null) buffer pointer"; case MAD_ERROR_NOMEM: return "not enough memory"; case MAD_ERROR_LOSTSYNC: return "lost synchronization"; case MAD_ERROR_BADLAYER: return "reserved header layer value"; case MAD_ERROR_BADBITRATE: return "forbidden bitrate value"; case MAD_ERROR_BADSAMPLERATE: return "reserved sample frequency value"; case MAD_ERROR_BADEMPHASIS: return "reserved emphasis value"; case MAD_ERROR_BADCRC: return "CRC check failed"; case MAD_ERROR_BADBITALLOC: return "forbidden bit allocation value"; case MAD_ERROR_BADSCALEFACTOR: return "bad scalefactor index"; case MAD_ERROR_BADMODE: return "bad bitrate/mode combination"; case MAD_ERROR_BADFRAMELEN: return "bad frame length"; case MAD_ERROR_BADBIGVALUES: return "bad big_values count"; case MAD_ERROR_BADBLOCKTYPE: return "reserved block_type"; case MAD_ERROR_BADSCFSI: return "bad scalefactor selection info"; case MAD_ERROR_BADDATAPTR: return "bad main_data_begin pointer"; case MAD_ERROR_BADPART3LEN: return "bad audio data length"; case MAD_ERROR_BADHUFFTABLE: return "bad Huffman table select"; case MAD_ERROR_BADHUFFDATA: return "Huffman data overrun"; case MAD_ERROR_BADSTEREO: return "incompatible block_type for JS"; Since you're not going to block on this bug, I think we should just leave it be as it will introduce some uncertainty.
+1 to Comment 14. RM decides/agrees.
See pull request: https://github.com/audacity/audacity/pull/506 Not intended for 2.4.0 inclusion. Just a reminder that it's there...
(In reply to Leland Lucius from comment #16) Testing using Gale's steps (I had to restructure to make them edible/workable)I cannot get this to work At step 5 there was no immediate freeze But at step 6 the import progress bar stopped at about 1/4 the way across and stopped there for several minutes At step 7 Cancel did indeed seem to work At step 8 I tried playing the existing 10 minute audio (used to create the export for step 1) The result was that it played fine and Audacity was not frozen - so this little bit at least appears to be an improvement.
(In reply to Leland Lucius from comment #16) Following the steps to reproduce, I get a crash with 86e5019 (pull/506) 13:03:16: Debug: LastOpenType is MP3 files 13:03:16: Debug: OverrideExtendedImportByOpenFileDialogChoice is 1 13:03:16: Debug: Inserting libmad 13:03:16: Debug: Appending libav 13:03:16: Debug: Appending libsndfile 13:03:16: Debug: Appending liboggvorbis 13:03:16: Debug: Appending libflac 13:03:16: Debug: Appending lof audacity: /home/runner/work/audacity/audacity/lib-src/libmad/layer3.c:2633: mad_layer_III: Assertion `stream->md_len + md_len - si.main_data_begin <= MAD_BUFFER_MDLEN' failed. Aborted (core dumped) The crash also occurs in 86e5019 without the Extended Import rule.
(In reply to Steve Daulton from comment #18) The crash also occurs in master 2.4.0 without the Extended Import rule. (commit eabcd35d26)
(In reply to Steve Daulton from comment #19) On Linux, I also see the crash with a 5 minute stereo WAV file renamed as ".mp3" in Audacity 2.3.3. The crash does not occur in Audacity 2.3.2. In Audacity 2.3.2 there is no "MP3 files" filter.
(In reply to Steve Daulton from comment #20) The crash in 2.3.3 (and 2.4.0) occurs without setting an Extended Import Rule.
Sorry guys for the waste of time. A shortcoming in how I set up the github actions workflows. If you'd like to test again, you can find the build here: https://github.com/lllucius/audacity/actions/runs/92346643 I'll fix the github action after 2.4.0.
Third times the charm: https://github.com/lllucius/audacity/actions/runs/92402660
(In reply to Leland Lucius from comment #23) That works for me. On Linux, following the steps to reproduce: MP3 Decoding Failed: forbidden bit allocation value
(In reply to Steve Daulton from comment #24) > (In reply to Leland Lucius from comment #23) > That works for me. > > On Linux, following the steps to reproduce: > > MP3 Decoding Failed: > forbidden bit allocation value Was this a good or bad MP3? That would depend on whether I say "Awesome" or "Bogus"! :-)
(In reply to Leland Lucius from comment #25) > (In reply to Steve Daulton from comment #24) > > (In reply to Leland Lucius from comment #23) > > That works for me. > > > > On Linux, following the steps to reproduce: > > > > MP3 Decoding Failed: > > forbidden bit allocation value > > Was this a good or bad MP3? That would depend on whether I say "Awesome" or > "Bogus"! :-) It does point out a user friendliness issue though. I'm currently just passing along the error message from libmad. I think we might want to soften those messages a bit with something more like "There was a problem importing this file." or some such.
Also, this error checking is why I don't think this change should go into 2.4.0. We need to throw a bunch of old and newly created MP3s at it to ensure we don't get false positives. It's been ignoring errors for 19 years and could have been successfully importing all manner of badly formed or corrupted MP3s with a dropped frame here and there.
(In reply to Leland Lucius from comment #25) The error message occurs with bad MP3s. Good MP3s import correctly. I agree that friendlier error messages would be nice. Does libmad also give non-critical error messages? "Oops, something is wrong but we can import it anyway" I agree that more testing is needed, but awesome so far :-)
(In reply to Steve Daulton from comment #28) > (In reply to Leland Lucius from comment #25) > The error message occurs with bad MP3s. > Good MP3s import correctly. > > I agree that friendlier error messages would be nice. > > Does libmad also give non-critical error messages? "Oops, something is wrong > but we can import it anyway" > Actually all of the errors listed below starting with LOSTSYNC are considered non-critical and are supposed to be recoverable. We've seen that libmad doesn't necessarily do a great job of recovering from "major" infractions. But, since we were just ignoring errors, we'd (and the users) would never have known if errors and subsequent recovery had occurred. I don't know how'd we recover most of them, but the LOSTSYNC fella is easy...just scan ahead until you find a SYNC word and report the number of skipped bytes to the user.
Comment 29 is getting worryingly into 'write our own new and improved libmad' territory. This bug is then at risk of major 'scope creep'. If the steps to reproduce now do not cause a crash then this bug can be closed. Other issues, such as a gloss on the error messages and even recovery options sometimes when reading fails, may merit a feature enhancement proposal. Unlike this bug, they are not P2.
Okay, let's see how this goes: https://github.com/audacity/audacity/commit/35e88d9 https://github.com/audacity/audacity/actions/runs/113646954 Importing with libmad should be much safer now. In addition, it will now detect a LAME tag and remove any leading delay and trailing padding. So, as long as there is a LAME tag in the imported MP3 file, the tracks should not include any unwanted silence at the beginning or end (unless the audio actually had itβs own silence).
Nice to see you using my GuardedCall :-) You understand the purpose of that: catch our C++ exceptions that might happen during our callbacks (such as we exhaust drive space trying to save the import data in our block files, so we throw), so they don't unwind across the stack frames of the lib mad library that are compiled in C and might not clean up properly. Translate exceptions to error codes instead. We DON'T clue the user in the case that this is the reason for import failure. We MIGHT do that by saving the exception, using std::exception_ptr, then exiting MAD, then re-throwing the saved exception at a safe place, then letting the application-level catch give the user a message dialog. You can see where I did use exception_ptr in Nyquist and SBSMSEffect. I never got around to making analogous code in Import. exception_ptr wasn't available when I did that on Mac, because we were still constrained to use C++98 standard library.
(In reply to Paul L from comment #32) > Nice to see you using my GuardedCall :-) > > You understand the purpose of that: catch our C++ exceptions that might > happen during our callbacks (such as we exhaust drive space trying to save > the import data in our block files, so we throw), so they don't unwind > across the stack frames of the lib mad library that are compiled in C and > might not clean up properly. Translate exceptions to error codes instead. > > We DON'T clue the user in the case that this is the reason for import > failure. We MIGHT do that by saving the exception, using > std::exception_ptr, then exiting MAD, then re-throwing the saved exception > at a safe place, then letting the application-level catch give the user a > message dialog. > So, should I remove the AudacityMessageBox() call in MP3ImportFileHandle::ErrorCB() to a throw and then display the message once libmad returns from the failed callback?
Testing on W10 with Audacity 2.4.2 35e88d9 I get no progress bars (step 6) - I don't get a freeze (step 5) What I do get is the following error message: >MP3 Decoding failed >forbidden bit allocation value Just what is the second part this message meant to explain to the average non-propellor head user (including me that is). The underlying issue is that they are trying to import an file that is masquerading as am MP3 but is not in fact an MP3 or a properly formulated MP3. Surely it would be much better if we could tell them that. So while this fixes the freeze and doesn't lock up Audacity (as in Step 8) I have marked this as "?" as the message is, for me at least, a residual issue.
(In reply to Peter Sampson from comment #34) Testing on macOS 10.15.4 with Audacity 2.4.2 35e88d9 I get the same opaque error message
After discussing this with the RM - I have added P3 Bug #2445 for the residula of the cryptic error message. Accordingly we can close this off as RESOLVED
Thank you Leland for finally getting rid of the crap at the start and end of MP3 files! This has been bugging me for years.
(In reply to Leland Lucius from comment #31) >In addition, it will now detect a LAME tag and remove any leading >delay and trailing padding. Confirmed by testing on W10 with Audacity 2.4.2 00084a8 And let me echo Bill's praise and thanks to Leland in Comment #37 where he wrote: >This has been bugging me for years. It has been bugging very many of our users for years. And we have been telling them that is a "feature of MP3" - when all the time it could have been in our hands to fix it. Great work Leland.
(In reply to Bill Wharrie from comment #37) This is sufficeently important, I think, that I added a note about it in the alpha Manual for 2.4.2 on the "What's new..." page. See: https://alphamanual.audacityteam.org/man/New_features_in_this_release#MP3_exports_without_padding
Thanks guys, but remember, it won't work for all MP3s. Only ones that include a LAME tag. I've found a bunch of older MP3s that do not have it. Some MP3 decoders go so far as to scan and trim silence from MP3s, but I didn't want to go there. :-)
(In reply to Leland Lucius from comment #33) Oh, there is error_cb. I didn't read every detail of the commit. No, do NOT throw while you are in error_cb. That is exactly the thing to avoid, as I explained: throwing a C++ exception while inside a callback from the library, which means there are levels of C in the stack above you. I presume you tested cases of malformed files where error_cb really is visited, and that it's safe to put up the message box. (Would not be if you are not in the main thread.) Has anyone also tested import in the case of exhaustion of space on a thumb drive, for useful error messages? To make that work may need some help from me in proper use of exception_ptr.
(In reply to Paul L from comment #41) > (In reply to Leland Lucius from comment #33) > > Oh, there is error_cb. I didn't read every detail of the commit. > > No, do NOT throw while you are in error_cb. That is exactly the thing to > avoid, as I explained: throwing a C++ exception while inside a callback > from the library, which means there are levels of C in the stack above you. > I didn't think so, but I figured I'd ask. > I presume you tested cases of malformed files where error_cb really is > visited, and that it's safe to put up the message box. (Would not be if you > are not in the main thread.) > Yepper. It's actually fairly easy as libmad has a bunch of checks for correctness and he's been TRYING to let us know...we've just been ignoring him.
(In reply to Paul L from comment #41) >Has anyone also tested import in the case of exhaustion of space on >a thumb drive, for useful error messages? I tested for this following a Forum report of this last week. There is no error message Audacity crashes in this circumstance. I logged a new bug for this Bug #2442 >Crash when importing (or editing) to a disk with insufficent disk space available