DOSBox-X
|
00001 /* 00002 FLAC audio decoder. Choice of public domain or MIT-0. See license statements at the end of this file. 00003 dr_flac - v0.12.13 - 2020-05-16 00004 00005 David Reid - mackron@gmail.com 00006 00007 GitHub: https://github.com/mackron/dr_libs 00008 */ 00009 00010 /* 00011 RELEASE NOTES - v0.12.0 00012 ======================= 00013 Version 0.12.0 has breaking API changes including changes to the existing API and the removal of deprecated APIs. 00014 00015 00016 Improved Client-Defined Memory Allocation 00017 ----------------------------------------- 00018 The main change with this release is the addition of a more flexible way of implementing custom memory allocation routines. The 00019 existing system of DRFLAC_MALLOC, DRFLAC_REALLOC and DRFLAC_FREE are still in place and will be used by default when no custom 00020 allocation callbacks are specified. 00021 00022 To use the new system, you pass in a pointer to a drflac_allocation_callbacks object to drflac_open() and family, like this: 00023 00024 void* my_malloc(size_t sz, void* pUserData) 00025 { 00026 return malloc(sz); 00027 } 00028 void* my_realloc(void* p, size_t sz, void* pUserData) 00029 { 00030 return realloc(p, sz); 00031 } 00032 void my_free(void* p, void* pUserData) 00033 { 00034 free(p); 00035 } 00036 00037 ... 00038 00039 drflac_allocation_callbacks allocationCallbacks; 00040 allocationCallbacks.pUserData = &myData; 00041 allocationCallbacks.onMalloc = my_malloc; 00042 allocationCallbacks.onRealloc = my_realloc; 00043 allocationCallbacks.onFree = my_free; 00044 drflac* pFlac = drflac_open_file("my_file.flac", &allocationCallbacks); 00045 00046 The advantage of this new system is that it allows you to specify user data which will be passed in to the allocation routines. 00047 00048 Passing in null for the allocation callbacks object will cause dr_flac to use defaults which is the same as DRFLAC_MALLOC, 00049 DRFLAC_REALLOC and DRFLAC_FREE and the equivalent of how it worked in previous versions. 00050 00051 Every API that opens a drflac object now takes this extra parameter. These include the following: 00052 00053 drflac_open() 00054 drflac_open_relaxed() 00055 drflac_open_with_metadata() 00056 drflac_open_with_metadata_relaxed() 00057 drflac_open_file() 00058 drflac_open_file_with_metadata() 00059 drflac_open_memory() 00060 drflac_open_memory_with_metadata() 00061 drflac_open_and_read_pcm_frames_s32() 00062 drflac_open_and_read_pcm_frames_s16() 00063 drflac_open_and_read_pcm_frames_f32() 00064 drflac_open_file_and_read_pcm_frames_s32() 00065 drflac_open_file_and_read_pcm_frames_s16() 00066 drflac_open_file_and_read_pcm_frames_f32() 00067 drflac_open_memory_and_read_pcm_frames_s32() 00068 drflac_open_memory_and_read_pcm_frames_s16() 00069 drflac_open_memory_and_read_pcm_frames_f32() 00070 00071 00072 00073 Optimizations 00074 ------------- 00075 Seeking performance has been greatly improved. A new binary search based seeking algorithm has been introduced which significantly 00076 improves performance over the brute force method which was used when no seek table was present. Seek table based seeking also takes 00077 advantage of the new binary search seeking system to further improve performance there as well. Note that this depends on CRC which 00078 means it will be disabled when DR_FLAC_NO_CRC is used. 00079 00080 The SSE4.1 pipeline has been cleaned up and optimized. You should see some improvements with decoding speed of 24-bit files in 00081 particular. 16-bit streams should also see some improvement. 00082 00083 drflac_read_pcm_frames_s16() has been optimized. Previously this sat on top of drflac_read_pcm_frames_s32() and performed it's s32 00084 to s16 conversion in a second pass. This is now all done in a single pass. This includes SSE2 and ARM NEON optimized paths. 00085 00086 A minor optimization has been implemented for drflac_read_pcm_frames_s32(). This will now use an SSE2 optimized pipeline for stereo 00087 channel reconstruction which is the last part of the decoding process. 00088 00089 The ARM build has seen a few improvements. The CLZ (count leading zeroes) and REV (byte swap) instructions are now used when 00090 compiling with GCC and Clang which is achieved using inline assembly. The CLZ instruction requires ARM architecture version 5 at 00091 compile time and the REV instruction requires ARM architecture version 6. 00092 00093 An ARM NEON optimized pipeline has been implemented. To enable this you'll need to add -mfpu=neon to the command line when compiling. 00094 00095 00096 Removed APIs 00097 ------------ 00098 The following APIs were deprecated in version 0.11.0 and have been completely removed in version 0.12.0: 00099 00100 drflac_read_s32() -> drflac_read_pcm_frames_s32() 00101 drflac_read_s16() -> drflac_read_pcm_frames_s16() 00102 drflac_read_f32() -> drflac_read_pcm_frames_f32() 00103 drflac_seek_to_sample() -> drflac_seek_to_pcm_frame() 00104 drflac_open_and_decode_s32() -> drflac_open_and_read_pcm_frames_s32() 00105 drflac_open_and_decode_s16() -> drflac_open_and_read_pcm_frames_s16() 00106 drflac_open_and_decode_f32() -> drflac_open_and_read_pcm_frames_f32() 00107 drflac_open_and_decode_file_s32() -> drflac_open_file_and_read_pcm_frames_s32() 00108 drflac_open_and_decode_file_s16() -> drflac_open_file_and_read_pcm_frames_s16() 00109 drflac_open_and_decode_file_f32() -> drflac_open_file_and_read_pcm_frames_f32() 00110 drflac_open_and_decode_memory_s32() -> drflac_open_memory_and_read_pcm_frames_s32() 00111 drflac_open_and_decode_memory_s16() -> drflac_open_memory_and_read_pcm_frames_s16() 00112 drflac_open_and_decode_memory_f32() -> drflac_open_memroy_and_read_pcm_frames_f32() 00113 00114 Prior versions of dr_flac operated on a per-sample basis whereas now it operates on PCM frames. The removed APIs all relate 00115 to the old per-sample APIs. You now need to use the "pcm_frame" versions. 00116 */ 00117 00118 00119 /* 00120 Introduction 00121 ============ 00122 dr_flac is a single file library. To use it, do something like the following in one .c file. 00123 00124 ```c 00125 #define DR_FLAC_IMPLEMENTATION 00126 #include "dr_flac.h" 00127 ``` 00128 00129 You can then #include this file in other parts of the program as you would with any other header file. To decode audio data, do something like the following: 00130 00131 ```c 00132 drflac* pFlac = drflac_open_file("MySong.flac", NULL); 00133 if (pFlac == NULL) { 00134 // Failed to open FLAC file 00135 } 00136 00137 drflac_int32* pSamples = malloc(pFlac->totalPCMFrameCount * pFlac->channels * sizeof(drflac_int32)); 00138 drflac_uint64 numberOfInterleavedSamplesActuallyRead = drflac_read_pcm_frames_s32(pFlac, pFlac->totalPCMFrameCount, pSamples); 00139 ``` 00140 00141 The drflac object represents the decoder. It is a transparent type so all the information you need, such as the number of channels and the bits per sample, 00142 should be directly accessible - just make sure you don't change their values. Samples are always output as interleaved signed 32-bit PCM. In the example above 00143 a native FLAC stream was opened, however dr_flac has seamless support for Ogg encapsulated FLAC streams as well. 00144 00145 You do not need to decode the entire stream in one go - you just specify how many samples you'd like at any given time and the decoder will give you as many 00146 samples as it can, up to the amount requested. Later on when you need the next batch of samples, just call it again. Example: 00147 00148 ```c 00149 while (drflac_read_pcm_frames_s32(pFlac, chunkSizeInPCMFrames, pChunkSamples) > 0) { 00150 do_something(); 00151 } 00152 ``` 00153 00154 You can seek to a specific PCM frame with `drflac_seek_to_pcm_frame()`. 00155 00156 If you just want to quickly decode an entire FLAC file in one go you can do something like this: 00157 00158 ```c 00159 unsigned int channels; 00160 unsigned int sampleRate; 00161 drflac_uint64 totalPCMFrameCount; 00162 drflac_int32* pSampleData = drflac_open_file_and_read_pcm_frames_s32("MySong.flac", &channels, &sampleRate, &totalPCMFrameCount, NULL); 00163 if (pSampleData == NULL) { 00164 // Failed to open and decode FLAC file. 00165 } 00166 00167 ... 00168 00169 drflac_free(pSampleData); 00170 ``` 00171 00172 You can read samples as signed 16-bit integer and 32-bit floating-point PCM with the *_s16() and *_f32() family of APIs respectively, but note that these 00173 should be considered lossy. 00174 00175 00176 If you need access to metadata (album art, etc.), use `drflac_open_with_metadata()`, `drflac_open_file_with_metdata()` or `drflac_open_memory_with_metadata()`. 00177 The rationale for keeping these APIs separate is that they're slightly slower than the normal versions and also just a little bit harder to use. dr_flac 00178 reports metadata to the application through the use of a callback, and every metadata block is reported before `drflac_open_with_metdata()` returns. 00179 00180 The main opening APIs (`drflac_open()`, etc.) will fail if the header is not present. The presents a problem in certain scenarios such as broadcast style 00181 streams or internet radio where the header may not be present because the user has started playback mid-stream. To handle this, use the relaxed APIs: 00182 00183 `drflac_open_relaxed()` 00184 `drflac_open_with_metadata_relaxed()` 00185 00186 It is not recommended to use these APIs for file based streams because a missing header would usually indicate a corrupt or perverse file. In addition, these 00187 APIs can take a long time to initialize because they may need to spend a lot of time finding the first frame. 00188 00189 00190 00191 Build Options 00192 ============= 00193 #define these options before including this file. 00194 00195 #define DR_FLAC_NO_STDIO 00196 Disable `drflac_open_file()` and family. 00197 00198 #define DR_FLAC_NO_OGG 00199 Disables support for Ogg/FLAC streams. 00200 00201 #define DR_FLAC_BUFFER_SIZE <number> 00202 Defines the size of the internal buffer to store data from onRead(). This buffer is used to reduce the number of calls back to the client for more data. 00203 Larger values means more memory, but better performance. My tests show diminishing returns after about 4KB (which is the default). Consider reducing this if 00204 you have a very efficient implementation of onRead(), or increase it if it's very inefficient. Must be a multiple of 8. 00205 00206 #define DR_FLAC_NO_CRC 00207 Disables CRC checks. This will offer a performance boost when CRC is unnecessary. This will disable binary search seeking. When seeking, the seek table will 00208 be used if available. Otherwise the seek will be performed using brute force. 00209 00210 #define DR_FLAC_NO_SIMD 00211 Disables SIMD optimizations (SSE on x86/x64 architectures, NEON on ARM architectures). Use this if you are having compatibility issues with your compiler. 00212 00213 00214 00215 Notes 00216 ===== 00217 - dr_flac does not support changing the sample rate nor channel count mid stream. 00218 - dr_flac is not thread-safe, but its APIs can be called from any thread so long as you do your own synchronization. 00219 - When using Ogg encapsulation, a corrupted metadata block will result in `drflac_open_with_metadata()` and `drflac_open()` returning inconsistent samples due 00220 to differences in corrupted stream recorvery logic between the two APIs. 00221 */ 00222 00223 #ifndef dr_flac_h 00224 #define dr_flac_h 00225 00226 #ifdef __cplusplus 00227 extern "C" { 00228 #endif 00229 00230 #define DRFLAC_STRINGIFY(x) #x 00231 #define DRFLAC_XSTRINGIFY(x) DRFLAC_STRINGIFY(x) 00232 00233 #define DRFLAC_VERSION_MAJOR 0 00234 #define DRFLAC_VERSION_MINOR 12 00235 #define DRFLAC_VERSION_REVISION 13 00236 #define DRFLAC_VERSION_STRING DRFLAC_XSTRINGIFY(DRFLAC_VERSION_MAJOR) "." DRFLAC_XSTRINGIFY(DRFLAC_VERSION_MINOR) "." DRFLAC_XSTRINGIFY(DRFLAC_VERSION_REVISION) 00237 00238 #include <stddef.h> /* For size_t. */ 00239 00240 /* Sized types. Prefer built-in types. Fall back to stdint. */ 00241 #ifdef _MSC_VER 00242 #if defined(__clang__) 00243 #pragma GCC diagnostic push 00244 #pragma GCC diagnostic ignored "-Wlanguage-extension-token" 00245 #pragma GCC diagnostic ignored "-Wlong-long" 00246 #pragma GCC diagnostic ignored "-Wc++11-long-long" 00247 #endif 00248 typedef signed __int8 drflac_int8; 00249 typedef unsigned __int8 drflac_uint8; 00250 typedef signed __int16 drflac_int16; 00251 typedef unsigned __int16 drflac_uint16; 00252 typedef signed __int32 drflac_int32; 00253 typedef unsigned __int32 drflac_uint32; 00254 typedef signed __int64 drflac_int64; 00255 typedef unsigned __int64 drflac_uint64; 00256 #if defined(__clang__) 00257 #pragma GCC diagnostic pop 00258 #endif 00259 #else 00260 #include <stdint.h> 00261 typedef int8_t drflac_int8; 00262 typedef uint8_t drflac_uint8; 00263 typedef int16_t drflac_int16; 00264 typedef uint16_t drflac_uint16; 00265 typedef int32_t drflac_int32; 00266 typedef uint32_t drflac_uint32; 00267 typedef int64_t drflac_int64; 00268 typedef uint64_t drflac_uint64; 00269 #endif 00270 typedef drflac_uint8 drflac_bool8; 00271 typedef drflac_uint32 drflac_bool32; 00272 #define DRFLAC_TRUE 1 00273 #define DRFLAC_FALSE 0 00274 00275 #if !defined(DRFLAC_API) 00276 #if defined(DRFLAC_DLL) 00277 #if defined(_WIN32) 00278 #define DRFLAC_DLL_IMPORT __declspec(dllimport) 00279 #define DRFLAC_DLL_EXPORT __declspec(dllexport) 00280 #define DRFLAC_DLL_PRIVATE static 00281 #else 00282 #if defined(__GNUC__) && __GNUC__ >= 4 00283 #define DRFLAC_DLL_IMPORT __attribute__((visibility("default"))) 00284 #define DRFLAC_DLL_EXPORT __attribute__((visibility("default"))) 00285 #define DRFLAC_DLL_PRIVATE __attribute__((visibility("hidden"))) 00286 #else 00287 #define DRFLAC_DLL_IMPORT 00288 #define DRFLAC_DLL_EXPORT 00289 #define DRFLAC_DLL_PRIVATE static 00290 #endif 00291 #endif 00292 00293 #if defined(DR_FLAC_IMPLEMENTATION) || defined(DRFLAC_IMPLEMENTATION) 00294 #define DRFLAC_API DRFLAC_DLL_EXPORT 00295 #else 00296 #define DRFLAC_API DRFLAC_DLL_IMPORT 00297 #endif 00298 #define DRFLAC_PRIVATE DRFLAC_DLL_PRIVATE 00299 #else 00300 #define DRFLAC_API extern 00301 #define DRFLAC_PRIVATE static 00302 #endif 00303 #endif 00304 00305 #if defined(_MSC_VER) && _MSC_VER >= 1700 /* Visual Studio 2012 */ 00306 #define DRFLAC_DEPRECATED __declspec(deprecated) 00307 #elif (defined(__GNUC__) && __GNUC__ >= 4) /* GCC 4 */ 00308 #define DRFLAC_DEPRECATED __attribute__((deprecated)) 00309 #elif defined(__has_feature) /* Clang */ 00310 #if __has_feature(attribute_deprecated) 00311 #define DRFLAC_DEPRECATED __attribute__((deprecated)) 00312 #else 00313 #define DRFLAC_DEPRECATED 00314 #endif 00315 #else 00316 #define DRFLAC_DEPRECATED 00317 #endif 00318 00319 DRFLAC_API void drflac_version(drflac_uint32* pMajor, drflac_uint32* pMinor, drflac_uint32* pRevision); 00320 DRFLAC_API const char* drflac_version_string(); 00321 00322 /* 00323 As data is read from the client it is placed into an internal buffer for fast access. This controls the size of that buffer. Larger values means more speed, 00324 but also more memory. In my testing there is diminishing returns after about 4KB, but you can fiddle with this to suit your own needs. Must be a multiple of 8. 00325 */ 00326 #ifndef DR_FLAC_BUFFER_SIZE 00327 #define DR_FLAC_BUFFER_SIZE 4096 00328 #endif 00329 00330 /* Check if we can enable 64-bit optimizations. */ 00331 #if defined(_WIN64) || defined(_LP64) || defined(__LP64__) 00332 #define DRFLAC_64BIT 00333 #endif 00334 00335 #ifdef DRFLAC_64BIT 00336 typedef drflac_uint64 drflac_cache_t; 00337 #else 00338 typedef drflac_uint32 drflac_cache_t; 00339 #endif 00340 00341 /* The various metadata block types. */ 00342 #define DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO 0 00343 #define DRFLAC_METADATA_BLOCK_TYPE_PADDING 1 00344 #define DRFLAC_METADATA_BLOCK_TYPE_APPLICATION 2 00345 #define DRFLAC_METADATA_BLOCK_TYPE_SEEKTABLE 3 00346 #define DRFLAC_METADATA_BLOCK_TYPE_VORBIS_COMMENT 4 00347 #define DRFLAC_METADATA_BLOCK_TYPE_CUESHEET 5 00348 #define DRFLAC_METADATA_BLOCK_TYPE_PICTURE 6 00349 #define DRFLAC_METADATA_BLOCK_TYPE_INVALID 127 00350 00351 /* The various picture types specified in the PICTURE block. */ 00352 #define DRFLAC_PICTURE_TYPE_OTHER 0 00353 #define DRFLAC_PICTURE_TYPE_FILE_ICON 1 00354 #define DRFLAC_PICTURE_TYPE_OTHER_FILE_ICON 2 00355 #define DRFLAC_PICTURE_TYPE_COVER_FRONT 3 00356 #define DRFLAC_PICTURE_TYPE_COVER_BACK 4 00357 #define DRFLAC_PICTURE_TYPE_LEAFLET_PAGE 5 00358 #define DRFLAC_PICTURE_TYPE_MEDIA 6 00359 #define DRFLAC_PICTURE_TYPE_LEAD_ARTIST 7 00360 #define DRFLAC_PICTURE_TYPE_ARTIST 8 00361 #define DRFLAC_PICTURE_TYPE_CONDUCTOR 9 00362 #define DRFLAC_PICTURE_TYPE_BAND 10 00363 #define DRFLAC_PICTURE_TYPE_COMPOSER 11 00364 #define DRFLAC_PICTURE_TYPE_LYRICIST 12 00365 #define DRFLAC_PICTURE_TYPE_RECORDING_LOCATION 13 00366 #define DRFLAC_PICTURE_TYPE_DURING_RECORDING 14 00367 #define DRFLAC_PICTURE_TYPE_DURING_PERFORMANCE 15 00368 #define DRFLAC_PICTURE_TYPE_SCREEN_CAPTURE 16 00369 #define DRFLAC_PICTURE_TYPE_BRIGHT_COLORED_FISH 17 00370 #define DRFLAC_PICTURE_TYPE_ILLUSTRATION 18 00371 #define DRFLAC_PICTURE_TYPE_BAND_LOGOTYPE 19 00372 #define DRFLAC_PICTURE_TYPE_PUBLISHER_LOGOTYPE 20 00373 00374 typedef enum 00375 { 00376 drflac_container_native, 00377 drflac_container_ogg, 00378 drflac_container_unknown 00379 } drflac_container; 00380 00381 typedef enum 00382 { 00383 drflac_seek_origin_start, 00384 drflac_seek_origin_current 00385 } drflac_seek_origin; 00386 00387 /* Packing is important on this structure because we map this directly to the raw data within the SEEKTABLE metadata block. */ 00388 #pragma pack(2) 00389 typedef struct 00390 { 00391 drflac_uint64 firstPCMFrame; 00392 drflac_uint64 flacFrameOffset; /* The offset from the first byte of the header of the first frame. */ 00393 drflac_uint16 pcmFrameCount; 00394 } drflac_seekpoint; 00395 #pragma pack() 00396 00397 typedef struct 00398 { 00399 drflac_uint16 minBlockSizeInPCMFrames; 00400 drflac_uint16 maxBlockSizeInPCMFrames; 00401 drflac_uint32 minFrameSizeInPCMFrames; 00402 drflac_uint32 maxFrameSizeInPCMFrames; 00403 drflac_uint32 sampleRate; 00404 drflac_uint8 channels; 00405 drflac_uint8 bitsPerSample; 00406 drflac_uint64 totalPCMFrameCount; 00407 drflac_uint8 md5[16]; 00408 } drflac_streaminfo; 00409 00410 typedef struct 00411 { 00412 /* The metadata type. Use this to know how to interpret the data below. */ 00413 drflac_uint32 type; 00414 00415 /* 00416 A pointer to the raw data. This points to a temporary buffer so don't hold on to it. It's best to 00417 not modify the contents of this buffer. Use the structures below for more meaningful and structured 00418 information about the metadata. It's possible for this to be null. 00419 */ 00420 const void* pRawData; 00421 00422 /* The size in bytes of the block and the buffer pointed to by pRawData if it's non-NULL. */ 00423 drflac_uint32 rawDataSize; 00424 00425 union 00426 { 00427 drflac_streaminfo streaminfo; 00428 00429 struct 00430 { 00431 int unused; 00432 } padding; 00433 00434 struct 00435 { 00436 drflac_uint32 id; 00437 const void* pData; 00438 drflac_uint32 dataSize; 00439 } application; 00440 00441 struct 00442 { 00443 drflac_uint32 seekpointCount; 00444 const drflac_seekpoint* pSeekpoints; 00445 } seektable; 00446 00447 struct 00448 { 00449 drflac_uint32 vendorLength; 00450 const char* vendor; 00451 drflac_uint32 commentCount; 00452 const void* pComments; 00453 } vorbis_comment; 00454 00455 struct 00456 { 00457 char catalog[128]; 00458 drflac_uint64 leadInSampleCount; 00459 drflac_bool32 isCD; 00460 drflac_uint8 trackCount; 00461 const void* pTrackData; 00462 } cuesheet; 00463 00464 struct 00465 { 00466 drflac_uint32 type; 00467 drflac_uint32 mimeLength; 00468 const char* mime; 00469 drflac_uint32 descriptionLength; 00470 const char* description; 00471 drflac_uint32 width; 00472 drflac_uint32 height; 00473 drflac_uint32 colorDepth; 00474 drflac_uint32 indexColorCount; 00475 drflac_uint32 pictureDataSize; 00476 const drflac_uint8* pPictureData; 00477 } picture; 00478 } data; 00479 } drflac_metadata; 00480 00481 00482 /* 00483 Callback for when data needs to be read from the client. 00484 00485 00486 Parameters 00487 ---------- 00488 pUserData (in) 00489 The user data that was passed to drflac_open() and family. 00490 00491 pBufferOut (out) 00492 The output buffer. 00493 00494 bytesToRead (in) 00495 The number of bytes to read. 00496 00497 00498 Return Value 00499 ------------ 00500 The number of bytes actually read. 00501 00502 00503 Remarks 00504 ------- 00505 A return value of less than bytesToRead indicates the end of the stream. Do _not_ return from this callback until either the entire bytesToRead is filled or 00506 you have reached the end of the stream. 00507 */ 00508 typedef size_t (* drflac_read_proc)(void* pUserData, void* pBufferOut, size_t bytesToRead); 00509 00510 /* 00511 Callback for when data needs to be seeked. 00512 00513 00514 Parameters 00515 ---------- 00516 pUserData (in) 00517 The user data that was passed to drflac_open() and family. 00518 00519 offset (in) 00520 The number of bytes to move, relative to the origin. Will never be negative. 00521 00522 origin (in) 00523 The origin of the seek - the current position or the start of the stream. 00524 00525 00526 Return Value 00527 ------------ 00528 Whether or not the seek was successful. 00529 00530 00531 Remarks 00532 ------- 00533 The offset will never be negative. Whether or not it is relative to the beginning or current position is determined by the "origin" parameter which will be 00534 either drflac_seek_origin_start or drflac_seek_origin_current. 00535 00536 When seeking to a PCM frame using drflac_seek_to_pcm_frame(), dr_flac may call this with an offset beyond the end of the FLAC stream. This needs to be detected 00537 and handled by returning DRFLAC_FALSE. 00538 */ 00539 typedef drflac_bool32 (* drflac_seek_proc)(void* pUserData, int offset, drflac_seek_origin origin); 00540 00541 /* 00542 Callback for when a metadata block is read. 00543 00544 00545 Parameters 00546 ---------- 00547 pUserData (in) 00548 The user data that was passed to drflac_open() and family. 00549 00550 pMetadata (in) 00551 A pointer to a structure containing the data of the metadata block. 00552 00553 00554 Remarks 00555 ------- 00556 Use pMetadata->type to determine which metadata block is being handled and how to read the data. 00557 */ 00558 typedef void (* drflac_meta_proc)(void* pUserData, drflac_metadata* pMetadata); 00559 00560 00561 typedef struct 00562 { 00563 void* pUserData; 00564 void* (* onMalloc)(size_t sz, void* pUserData); 00565 void* (* onRealloc)(void* p, size_t sz, void* pUserData); 00566 void (* onFree)(void* p, void* pUserData); 00567 } drflac_allocation_callbacks; 00568 00569 /* Structure for internal use. Only used for decoders opened with drflac_open_memory. */ 00570 typedef struct 00571 { 00572 const drflac_uint8* data; 00573 size_t dataSize; 00574 size_t currentReadPos; 00575 } drflac__memory_stream; 00576 00577 /* Structure for internal use. Used for bit streaming. */ 00578 typedef struct 00579 { 00580 /* The function to call when more data needs to be read. */ 00581 drflac_read_proc onRead; 00582 00583 /* The function to call when the current read position needs to be moved. */ 00584 drflac_seek_proc onSeek; 00585 00586 /* The user data to pass around to onRead and onSeek. */ 00587 void* pUserData; 00588 00589 00590 /* 00591 The number of unaligned bytes in the L2 cache. This will always be 0 until the end of the stream is hit. At the end of the 00592 stream there will be a number of bytes that don't cleanly fit in an L1 cache line, so we use this variable to know whether 00593 or not the bistreamer needs to run on a slower path to read those last bytes. This will never be more than sizeof(drflac_cache_t). 00594 */ 00595 size_t unalignedByteCount; 00596 00597 /* The content of the unaligned bytes. */ 00598 drflac_cache_t unalignedCache; 00599 00600 /* The index of the next valid cache line in the "L2" cache. */ 00601 drflac_uint32 nextL2Line; 00602 00603 /* The number of bits that have been consumed by the cache. This is used to determine how many valid bits are remaining. */ 00604 drflac_uint32 consumedBits; 00605 00606 /* 00607 The cached data which was most recently read from the client. There are two levels of cache. Data flows as such: 00608 Client -> L2 -> L1. The L2 -> L1 movement is aligned and runs on a fast path in just a few instructions. 00609 */ 00610 drflac_cache_t cacheL2[DR_FLAC_BUFFER_SIZE/sizeof(drflac_cache_t)]; 00611 drflac_cache_t cache; 00612 00613 /* 00614 CRC-16. This is updated whenever bits are read from the bit stream. Manually set this to 0 to reset the CRC. For FLAC, this 00615 is reset to 0 at the beginning of each frame. 00616 */ 00617 drflac_uint16 crc16; 00618 drflac_cache_t crc16Cache; /* A cache for optimizing CRC calculations. This is filled when when the L1 cache is reloaded. */ 00619 drflac_uint32 crc16CacheIgnoredBytes; /* The number of bytes to ignore when updating the CRC-16 from the CRC-16 cache. */ 00620 } drflac_bs; 00621 00622 typedef struct 00623 { 00624 /* The type of the subframe: SUBFRAME_CONSTANT, SUBFRAME_VERBATIM, SUBFRAME_FIXED or SUBFRAME_LPC. */ 00625 drflac_uint8 subframeType; 00626 00627 /* The number of wasted bits per sample as specified by the sub-frame header. */ 00628 drflac_uint8 wastedBitsPerSample; 00629 00630 /* The order to use for the prediction stage for SUBFRAME_FIXED and SUBFRAME_LPC. */ 00631 drflac_uint8 lpcOrder; 00632 00633 /* A pointer to the buffer containing the decoded samples in the subframe. This pointer is an offset from drflac::pExtraData. */ 00634 drflac_int32* pSamplesS32; 00635 } drflac_subframe; 00636 00637 typedef struct 00638 { 00639 /* 00640 If the stream uses variable block sizes, this will be set to the index of the first PCM frame. If fixed block sizes are used, this will 00641 always be set to 0. This is 64-bit because the decoded PCM frame number will be 36 bits. 00642 */ 00643 drflac_uint64 pcmFrameNumber; 00644 00645 /* 00646 If the stream uses fixed block sizes, this will be set to the frame number. If variable block sizes are used, this will always be 0. This 00647 is 32-bit because in fixed block sizes, the maximum frame number will be 31 bits. 00648 */ 00649 drflac_uint32 flacFrameNumber; 00650 00651 /* The sample rate of this frame. */ 00652 drflac_uint32 sampleRate; 00653 00654 /* The number of PCM frames in each sub-frame within this frame. */ 00655 drflac_uint16 blockSizeInPCMFrames; 00656 00657 /* 00658 The channel assignment of this frame. This is not always set to the channel count. If interchannel decorrelation is being used this 00659 will be set to DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE, DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE or DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE. 00660 */ 00661 drflac_uint8 channelAssignment; 00662 00663 /* The number of bits per sample within this frame. */ 00664 drflac_uint8 bitsPerSample; 00665 00666 /* The frame's CRC. */ 00667 drflac_uint8 crc8; 00668 } drflac_frame_header; 00669 00670 typedef struct 00671 { 00672 /* The header. */ 00673 drflac_frame_header header; 00674 00675 /* 00676 The number of PCM frames left to be read in this FLAC frame. This is initially set to the block size. As PCM frames are read, 00677 this will be decremented. When it reaches 0, the decoder will see this frame as fully consumed and load the next frame. 00678 */ 00679 drflac_uint32 pcmFramesRemaining; 00680 00681 /* The list of sub-frames within the frame. There is one sub-frame for each channel, and there's a maximum of 8 channels. */ 00682 drflac_subframe subframes[8]; 00683 } drflac_frame; 00684 00685 typedef struct 00686 { 00687 /* The function to call when a metadata block is read. */ 00688 drflac_meta_proc onMeta; 00689 00690 /* The user data posted to the metadata callback function. */ 00691 void* pUserDataMD; 00692 00693 /* Memory allocation callbacks. */ 00694 drflac_allocation_callbacks allocationCallbacks; 00695 00696 00697 /* The sample rate. Will be set to something like 44100. */ 00698 drflac_uint32 sampleRate; 00699 00700 /* 00701 The number of channels. This will be set to 1 for monaural streams, 2 for stereo, etc. Maximum 8. This is set based on the 00702 value specified in the STREAMINFO block. 00703 */ 00704 drflac_uint8 channels; 00705 00706 /* The bits per sample. Will be set to something like 16, 24, etc. */ 00707 drflac_uint8 bitsPerSample; 00708 00709 /* The maximum block size, in samples. This number represents the number of samples in each channel (not combined). */ 00710 drflac_uint16 maxBlockSizeInPCMFrames; 00711 00712 /* 00713 The total number of PCM Frames making up the stream. Can be 0 in which case it's still a valid stream, but just means 00714 the total PCM frame count is unknown. Likely the case with streams like internet radio. 00715 */ 00716 drflac_uint64 totalPCMFrameCount; 00717 00718 00719 /* The container type. This is set based on whether or not the decoder was opened from a native or Ogg stream. */ 00720 drflac_container container; 00721 00722 /* The number of seekpoints in the seektable. */ 00723 drflac_uint32 seekpointCount; 00724 00725 00726 /* Information about the frame the decoder is currently sitting on. */ 00727 drflac_frame currentFLACFrame; 00728 00729 00730 /* The index of the PCM frame the decoder is currently sitting on. This is only used for seeking. */ 00731 drflac_uint64 currentPCMFrame; 00732 00733 /* The position of the first FLAC frame in the stream. This is only ever used for seeking. */ 00734 drflac_uint64 firstFLACFramePosInBytes; 00735 00736 00737 /* A hack to avoid a malloc() when opening a decoder with drflac_open_memory(). */ 00738 drflac__memory_stream memoryStream; 00739 00740 00741 /* A pointer to the decoded sample data. This is an offset of pExtraData. */ 00742 drflac_int32* pDecodedSamples; 00743 00744 /* A pointer to the seek table. This is an offset of pExtraData, or NULL if there is no seek table. */ 00745 drflac_seekpoint* pSeekpoints; 00746 00747 /* Internal use only. Only used with Ogg containers. Points to a drflac_oggbs object. This is an offset of pExtraData. */ 00748 void* _oggbs; 00749 00750 /* Internal use only. Used for profiling and testing different seeking modes. */ 00751 drflac_bool32 _noSeekTableSeek : 1; 00752 drflac_bool32 _noBinarySearchSeek : 1; 00753 drflac_bool32 _noBruteForceSeek : 1; 00754 00755 /* The bit streamer. The raw FLAC data is fed through this object. */ 00756 drflac_bs bs; 00757 00758 /* Variable length extra data. We attach this to the end of the object so we can avoid unnecessary mallocs. */ 00759 drflac_uint8 pExtraData[1]; 00760 } drflac; 00761 00762 00763 /* 00764 Opens a FLAC decoder. 00765 00766 00767 Parameters 00768 ---------- 00769 onRead (in) 00770 The function to call when data needs to be read from the client. 00771 00772 onSeek (in) 00773 The function to call when the read position of the client data needs to move. 00774 00775 pUserData (in, optional) 00776 A pointer to application defined data that will be passed to onRead and onSeek. 00777 00778 pAllocationCallbacks (in, optional) 00779 A pointer to application defined callbacks for managing memory allocations. 00780 00781 00782 Return Value 00783 ------------ 00784 Returns a pointer to an object representing the decoder. 00785 00786 00787 Remarks 00788 ------- 00789 Close the decoder with `drflac_close()`. 00790 00791 `pAllocationCallbacks` can be NULL in which case it will use `DRFLAC_MALLOC`, `DRFLAC_REALLOC` and `DRFLAC_FREE`. 00792 00793 This function will automatically detect whether or not you are attempting to open a native or Ogg encapsulated FLAC, both of which should work seamlessly 00794 without any manual intervention. Ogg encapsulation also works with multiplexed streams which basically means it can play FLAC encoded audio tracks in videos. 00795 00796 This is the lowest level function for opening a FLAC stream. You can also use `drflac_open_file()` and `drflac_open_memory()` to open the stream from a file or 00797 from a block of memory respectively. 00798 00799 The STREAMINFO block must be present for this to succeed. Use `drflac_open_relaxed()` to open a FLAC stream where the header may not be present. 00800 00801 00802 Seek Also 00803 --------- 00804 drflac_open_file() 00805 drflac_open_memory() 00806 drflac_open_with_metadata() 00807 drflac_close() 00808 */ 00809 DRFLAC_API drflac* drflac_open(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks); 00810 00811 /* 00812 Opens a FLAC stream with relaxed validation of the header block. 00813 00814 00815 Parameters 00816 ---------- 00817 onRead (in) 00818 The function to call when data needs to be read from the client. 00819 00820 onSeek (in) 00821 The function to call when the read position of the client data needs to move. 00822 00823 container (in) 00824 Whether or not the FLAC stream is encapsulated using standard FLAC encapsulation or Ogg encapsulation. 00825 00826 pUserData (in, optional) 00827 A pointer to application defined data that will be passed to onRead and onSeek. 00828 00829 pAllocationCallbacks (in, optional) 00830 A pointer to application defined callbacks for managing memory allocations. 00831 00832 00833 Return Value 00834 ------------ 00835 A pointer to an object representing the decoder. 00836 00837 00838 Remarks 00839 ------- 00840 The same as drflac_open(), except attempts to open the stream even when a header block is not present. 00841 00842 Because the header is not necessarily available, the caller must explicitly define the container (Native or Ogg). Do not set this to `drflac_container_unknown` 00843 as that is for internal use only. 00844 00845 Opening in relaxed mode will continue reading data from onRead until it finds a valid frame. If a frame is never found it will continue forever. To abort, 00846 force your `onRead` callback to return 0, which dr_flac will use as an indicator that the end of the stream was found. 00847 */ 00848 DRFLAC_API drflac* drflac_open_relaxed(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_container container, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks); 00849 00850 /* 00851 Opens a FLAC decoder and notifies the caller of the metadata chunks (album art, etc.). 00852 00853 00854 Parameters 00855 ---------- 00856 onRead (in) 00857 The function to call when data needs to be read from the client. 00858 00859 onSeek (in) 00860 The function to call when the read position of the client data needs to move. 00861 00862 onMeta (in) 00863 The function to call for every metadata block. 00864 00865 pUserData (in, optional) 00866 A pointer to application defined data that will be passed to onRead, onSeek and onMeta. 00867 00868 pAllocationCallbacks (in, optional) 00869 A pointer to application defined callbacks for managing memory allocations. 00870 00871 00872 Return Value 00873 ------------ 00874 A pointer to an object representing the decoder. 00875 00876 00877 Remarks 00878 ------- 00879 Close the decoder with `drflac_close()`. 00880 00881 `pAllocationCallbacks` can be NULL in which case it will use `DRFLAC_MALLOC`, `DRFLAC_REALLOC` and `DRFLAC_FREE`. 00882 00883 This is slower than `drflac_open()`, so avoid this one if you don't need metadata. Internally, this will allocate and free memory on the heap for every 00884 metadata block except for STREAMINFO and PADDING blocks. 00885 00886 The caller is notified of the metadata via the `onMeta` callback. All metadata blocks will be handled before the function returns. 00887 00888 The STREAMINFO block must be present for this to succeed. Use `drflac_open_with_metadata_relaxed()` to open a FLAC stream where the header may not be present. 00889 00890 Note that this will behave inconsistently with `drflac_open()` if the stream is an Ogg encapsulated stream and a metadata block is corrupted. This is due to 00891 the way the Ogg stream recovers from corrupted pages. When `drflac_open_with_metadata()` is being used, the open routine will try to read the contents of the 00892 metadata block, whereas `drflac_open()` will simply seek past it (for the sake of efficiency). This inconsistency can result in different samples being 00893 returned depending on whether or not the stream is being opened with metadata. 00894 00895 00896 Seek Also 00897 --------- 00898 drflac_open_file_with_metadata() 00899 drflac_open_memory_with_metadata() 00900 drflac_open() 00901 drflac_close() 00902 */ 00903 DRFLAC_API drflac* drflac_open_with_metadata(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks); 00904 00905 /* 00906 The same as drflac_open_with_metadata(), except attempts to open the stream even when a header block is not present. 00907 00908 See Also 00909 -------- 00910 drflac_open_with_metadata() 00911 drflac_open_relaxed() 00912 */ 00913 DRFLAC_API drflac* drflac_open_with_metadata_relaxed(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, drflac_container container, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks); 00914 00915 /* 00916 Closes the given FLAC decoder. 00917 00918 00919 Parameters 00920 ---------- 00921 pFlac (in) 00922 The decoder to close. 00923 00924 00925 Remarks 00926 ------- 00927 This will destroy the decoder object. 00928 00929 00930 See Also 00931 -------- 00932 drflac_open() 00933 drflac_open_with_metadata() 00934 drflac_open_file() 00935 drflac_open_file_w() 00936 drflac_open_file_with_metadata() 00937 drflac_open_file_with_metadata_w() 00938 drflac_open_memory() 00939 drflac_open_memory_with_metadata() 00940 */ 00941 DRFLAC_API void drflac_close(drflac* pFlac); 00942 00943 00944 /* 00945 Reads sample data from the given FLAC decoder, output as interleaved signed 32-bit PCM. 00946 00947 00948 Parameters 00949 ---------- 00950 pFlac (in) 00951 The decoder. 00952 00953 framesToRead (in) 00954 The number of PCM frames to read. 00955 00956 pBufferOut (out, optional) 00957 A pointer to the buffer that will receive the decoded samples. 00958 00959 00960 Return Value 00961 ------------ 00962 Returns the number of PCM frames actually read. If the return value is less than `framesToRead` it has reached the end. 00963 00964 00965 Remarks 00966 ------- 00967 pBufferOut can be null, in which case the call will act as a seek, and the return value will be the number of frames seeked. 00968 */ 00969 DRFLAC_API drflac_uint64 drflac_read_pcm_frames_s32(drflac* pFlac, drflac_uint64 framesToRead, drflac_int32* pBufferOut); 00970 00971 00972 /* 00973 Reads sample data from the given FLAC decoder, output as interleaved signed 16-bit PCM. 00974 00975 00976 Parameters 00977 ---------- 00978 pFlac (in) 00979 The decoder. 00980 00981 framesToRead (in) 00982 The number of PCM frames to read. 00983 00984 pBufferOut (out, optional) 00985 A pointer to the buffer that will receive the decoded samples. 00986 00987 00988 Return Value 00989 ------------ 00990 Returns the number of PCM frames actually read. If the return value is less than `framesToRead` it has reached the end. 00991 00992 00993 Remarks 00994 ------- 00995 pBufferOut can be null, in which case the call will act as a seek, and the return value will be the number of frames seeked. 00996 00997 Note that this is lossy for streams where the bits per sample is larger than 16. 00998 */ 00999 DRFLAC_API drflac_uint64 drflac_read_pcm_frames_s16(drflac* pFlac, drflac_uint64 framesToRead, drflac_int16* pBufferOut); 01000 01001 /* 01002 Reads sample data from the given FLAC decoder, output as interleaved 32-bit floating point PCM. 01003 01004 01005 Parameters 01006 ---------- 01007 pFlac (in) 01008 The decoder. 01009 01010 framesToRead (in) 01011 The number of PCM frames to read. 01012 01013 pBufferOut (out, optional) 01014 A pointer to the buffer that will receive the decoded samples. 01015 01016 01017 Return Value 01018 ------------ 01019 Returns the number of PCM frames actually read. If the return value is less than `framesToRead` it has reached the end. 01020 01021 01022 Remarks 01023 ------- 01024 pBufferOut can be null, in which case the call will act as a seek, and the return value will be the number of frames seeked. 01025 01026 Note that this should be considered lossy due to the nature of floating point numbers not being able to exactly represent every possible number. 01027 */ 01028 DRFLAC_API drflac_uint64 drflac_read_pcm_frames_f32(drflac* pFlac, drflac_uint64 framesToRead, float* pBufferOut); 01029 01030 /* 01031 Seeks to the PCM frame at the given index. 01032 01033 01034 Parameters 01035 ---------- 01036 pFlac (in) 01037 The decoder. 01038 01039 pcmFrameIndex (in) 01040 The index of the PCM frame to seek to. See notes below. 01041 01042 01043 Return Value 01044 ------------- 01045 `DRFLAC_TRUE` if successful; `DRFLAC_FALSE` otherwise. 01046 */ 01047 DRFLAC_API drflac_bool32 drflac_seek_to_pcm_frame(drflac* pFlac, drflac_uint64 pcmFrameIndex); 01048 01049 01050 01051 #ifndef DR_FLAC_NO_STDIO 01052 /* 01053 Opens a FLAC decoder from the file at the given path. 01054 01055 01056 Parameters 01057 ---------- 01058 pFileName (in) 01059 The path of the file to open, either absolute or relative to the current directory. 01060 01061 pAllocationCallbacks (in, optional) 01062 A pointer to application defined callbacks for managing memory allocations. 01063 01064 01065 Return Value 01066 ------------ 01067 A pointer to an object representing the decoder. 01068 01069 01070 Remarks 01071 ------- 01072 Close the decoder with drflac_close(). 01073 01074 01075 Remarks 01076 ------- 01077 This will hold a handle to the file until the decoder is closed with drflac_close(). Some platforms will restrict the number of files a process can have open 01078 at any given time, so keep this mind if you have many decoders open at the same time. 01079 01080 01081 See Also 01082 -------- 01083 drflac_open_file_with_metadata() 01084 drflac_open() 01085 drflac_close() 01086 */ 01087 DRFLAC_API drflac* drflac_open_file(const char* pFileName, const drflac_allocation_callbacks* pAllocationCallbacks); 01088 DRFLAC_API drflac* drflac_open_file_w(const wchar_t* pFileName, const drflac_allocation_callbacks* pAllocationCallbacks); 01089 01090 /* 01091 Opens a FLAC decoder from the file at the given path and notifies the caller of the metadata chunks (album art, etc.) 01092 01093 01094 Parameters 01095 ---------- 01096 pFileName (in) 01097 The path of the file to open, either absolute or relative to the current directory. 01098 01099 pAllocationCallbacks (in, optional) 01100 A pointer to application defined callbacks for managing memory allocations. 01101 01102 onMeta (in) 01103 The callback to fire for each metadata block. 01104 01105 pUserData (in) 01106 A pointer to the user data to pass to the metadata callback. 01107 01108 pAllocationCallbacks (in) 01109 A pointer to application defined callbacks for managing memory allocations. 01110 01111 01112 Remarks 01113 ------- 01114 Look at the documentation for drflac_open_with_metadata() for more information on how metadata is handled. 01115 01116 01117 See Also 01118 -------- 01119 drflac_open_with_metadata() 01120 drflac_open() 01121 drflac_close() 01122 */ 01123 DRFLAC_API drflac* drflac_open_file_with_metadata(const char* pFileName, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks); 01124 DRFLAC_API drflac* drflac_open_file_with_metadata_w(const wchar_t* pFileName, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks); 01125 #endif 01126 01127 /* 01128 Opens a FLAC decoder from a pre-allocated block of memory 01129 01130 01131 Parameters 01132 ---------- 01133 pData (in) 01134 A pointer to the raw encoded FLAC data. 01135 01136 dataSize (in) 01137 The size in bytes of `data`. 01138 01139 pAllocationCallbacks (in) 01140 A pointer to application defined callbacks for managing memory allocations. 01141 01142 01143 Return Value 01144 ------------ 01145 A pointer to an object representing the decoder. 01146 01147 01148 Remarks 01149 ------- 01150 This does not create a copy of the data. It is up to the application to ensure the buffer remains valid for the lifetime of the decoder. 01151 01152 01153 See Also 01154 -------- 01155 drflac_open() 01156 drflac_close() 01157 */ 01158 DRFLAC_API drflac* drflac_open_memory(const void* pData, size_t dataSize, const drflac_allocation_callbacks* pAllocationCallbacks); 01159 01160 /* 01161 Opens a FLAC decoder from a pre-allocated block of memory and notifies the caller of the metadata chunks (album art, etc.) 01162 01163 01164 Parameters 01165 ---------- 01166 pData (in) 01167 A pointer to the raw encoded FLAC data. 01168 01169 dataSize (in) 01170 The size in bytes of `data`. 01171 01172 onMeta (in) 01173 The callback to fire for each metadata block. 01174 01175 pUserData (in) 01176 A pointer to the user data to pass to the metadata callback. 01177 01178 pAllocationCallbacks (in) 01179 A pointer to application defined callbacks for managing memory allocations. 01180 01181 01182 Remarks 01183 ------- 01184 Look at the documentation for drflac_open_with_metadata() for more information on how metadata is handled. 01185 01186 01187 See Also 01188 ------- 01189 drflac_open_with_metadata() 01190 drflac_open() 01191 drflac_close() 01192 */ 01193 DRFLAC_API drflac* drflac_open_memory_with_metadata(const void* pData, size_t dataSize, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks); 01194 01195 01196 01197 /* High Level APIs */ 01198 01199 /* 01200 Opens a FLAC stream from the given callbacks and fully decodes it in a single operation. The return value is a 01201 pointer to the sample data as interleaved signed 32-bit PCM. The returned data must be freed with drflac_free(). 01202 01203 You can pass in custom memory allocation callbacks via the pAllocationCallbacks parameter. This can be NULL in which 01204 case it will use DRFLAC_MALLOC, DRFLAC_REALLOC and DRFLAC_FREE. 01205 01206 Sometimes a FLAC file won't keep track of the total sample count. In this situation the function will continuously 01207 read samples into a dynamically sized buffer on the heap until no samples are left. 01208 01209 Do not call this function on a broadcast type of stream (like internet radio streams and whatnot). 01210 */ 01211 DRFLAC_API drflac_int32* drflac_open_and_read_pcm_frames_s32(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks); 01212 01213 /* Same as drflac_open_and_read_pcm_frames_s32(), except returns signed 16-bit integer samples. */ 01214 DRFLAC_API drflac_int16* drflac_open_and_read_pcm_frames_s16(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks); 01215 01216 /* Same as drflac_open_and_read_pcm_frames_s32(), except returns 32-bit floating-point samples. */ 01217 DRFLAC_API float* drflac_open_and_read_pcm_frames_f32(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks); 01218 01219 #ifndef DR_FLAC_NO_STDIO 01220 /* Same as drflac_open_and_read_pcm_frames_s32() except opens the decoder from a file. */ 01221 DRFLAC_API drflac_int32* drflac_open_file_and_read_pcm_frames_s32(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks); 01222 01223 /* Same as drflac_open_file_and_read_pcm_frames_s32(), except returns signed 16-bit integer samples. */ 01224 DRFLAC_API drflac_int16* drflac_open_file_and_read_pcm_frames_s16(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks); 01225 01226 /* Same as drflac_open_file_and_read_pcm_frames_s32(), except returns 32-bit floating-point samples. */ 01227 DRFLAC_API float* drflac_open_file_and_read_pcm_frames_f32(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks); 01228 #endif 01229 01230 /* Same as drflac_open_and_read_pcm_frames_s32() except opens the decoder from a block of memory. */ 01231 DRFLAC_API drflac_int32* drflac_open_memory_and_read_pcm_frames_s32(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks); 01232 01233 /* Same as drflac_open_memory_and_read_pcm_frames_s32(), except returns signed 16-bit integer samples. */ 01234 DRFLAC_API drflac_int16* drflac_open_memory_and_read_pcm_frames_s16(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks); 01235 01236 /* Same as drflac_open_memory_and_read_pcm_frames_s32(), except returns 32-bit floating-point samples. */ 01237 DRFLAC_API float* drflac_open_memory_and_read_pcm_frames_f32(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks); 01238 01239 /* 01240 Frees memory that was allocated internally by dr_flac. 01241 01242 Set pAllocationCallbacks to the same object that was passed to drflac_open_*_and_read_pcm_frames_*(). If you originally passed in NULL, pass in NULL for this. 01243 */ 01244 DRFLAC_API void drflac_free(void* p, const drflac_allocation_callbacks* pAllocationCallbacks); 01245 01246 01247 /* Structure representing an iterator for vorbis comments in a VORBIS_COMMENT metadata block. */ 01248 typedef struct 01249 { 01250 drflac_uint32 countRemaining; 01251 const char* pRunningData; 01252 } drflac_vorbis_comment_iterator; 01253 01254 /* 01255 Initializes a vorbis comment iterator. This can be used for iterating over the vorbis comments in a VORBIS_COMMENT 01256 metadata block. 01257 */ 01258 DRFLAC_API void drflac_init_vorbis_comment_iterator(drflac_vorbis_comment_iterator* pIter, drflac_uint32 commentCount, const void* pComments); 01259 01260 /* 01261 Goes to the next vorbis comment in the given iterator. If null is returned it means there are no more comments. The 01262 returned string is NOT null terminated. 01263 */ 01264 DRFLAC_API const char* drflac_next_vorbis_comment(drflac_vorbis_comment_iterator* pIter, drflac_uint32* pCommentLengthOut); 01265 01266 01267 /* Structure representing an iterator for cuesheet tracks in a CUESHEET metadata block. */ 01268 typedef struct 01269 { 01270 drflac_uint32 countRemaining; 01271 const char* pRunningData; 01272 } drflac_cuesheet_track_iterator; 01273 01274 /* Packing is important on this structure because we map this directly to the raw data within the CUESHEET metadata block. */ 01275 #pragma pack(4) 01276 typedef struct 01277 { 01278 drflac_uint64 offset; 01279 drflac_uint8 index; 01280 drflac_uint8 reserved[3]; 01281 } drflac_cuesheet_track_index; 01282 #pragma pack() 01283 01284 typedef struct 01285 { 01286 drflac_uint64 offset; 01287 drflac_uint8 trackNumber; 01288 char ISRC[12]; 01289 drflac_bool8 isAudio; 01290 drflac_bool8 preEmphasis; 01291 drflac_uint8 indexCount; 01292 const drflac_cuesheet_track_index* pIndexPoints; 01293 } drflac_cuesheet_track; 01294 01295 /* 01296 Initializes a cuesheet track iterator. This can be used for iterating over the cuesheet tracks in a CUESHEET metadata 01297 block. 01298 */ 01299 DRFLAC_API void drflac_init_cuesheet_track_iterator(drflac_cuesheet_track_iterator* pIter, drflac_uint32 trackCount, const void* pTrackData); 01300 01301 /* Goes to the next cuesheet track in the given iterator. If DRFLAC_FALSE is returned it means there are no more comments. */ 01302 DRFLAC_API drflac_bool32 drflac_next_cuesheet_track(drflac_cuesheet_track_iterator* pIter, drflac_cuesheet_track* pCuesheetTrack); 01303 01304 01305 #ifdef __cplusplus 01306 } 01307 #endif 01308 #endif /* dr_flac_h */ 01309 01310 01311 /************************************************************************************************************************************************************ 01312 ************************************************************************************************************************************************************ 01313 01314 IMPLEMENTATION 01315 01316 ************************************************************************************************************************************************************ 01317 ************************************************************************************************************************************************************/ 01318 #if defined(DR_FLAC_IMPLEMENTATION) || defined(DRFLAC_IMPLEMENTATION) 01319 01320 /* Disable some annoying warnings. */ 01321 #if defined(__GNUC__) 01322 #pragma GCC diagnostic push 01323 #if __GNUC__ >= 7 01324 #pragma GCC diagnostic ignored "-Wimplicit-fallthrough" 01325 #endif 01326 #endif 01327 01328 #ifdef __linux__ 01329 #ifndef _BSD_SOURCE 01330 #define _BSD_SOURCE 01331 #endif 01332 #ifndef __USE_BSD 01333 #define __USE_BSD 01334 #endif 01335 #include <endian.h> 01336 #endif 01337 01338 #include <stdlib.h> 01339 #include <string.h> 01340 01341 #ifdef _MSC_VER 01342 #define DRFLAC_INLINE __forceinline 01343 #elif defined(__GNUC__) 01344 /* 01345 I've had a bug report where GCC is emitting warnings about functions possibly not being inlineable. This warning happens when 01346 the __attribute__((always_inline)) attribute is defined without an "inline" statement. I think therefore there must be some 01347 case where "__inline__" is not always defined, thus the compiler emitting these warnings. When using -std=c89 or -ansi on the 01348 command line, we cannot use the "inline" keyword and instead need to use "__inline__". In an attempt to work around this issue 01349 I am using "__inline__" only when we're compiling in strict ANSI mode. 01350 */ 01351 #if defined(__STRICT_ANSI__) 01352 #define DRFLAC_INLINE __inline__ __attribute__((always_inline)) 01353 #else 01354 #define DRFLAC_INLINE inline __attribute__((always_inline)) 01355 #endif 01356 #else 01357 #define DRFLAC_INLINE 01358 #endif 01359 01360 /* CPU architecture. */ 01361 #if defined(__x86_64__) || defined(_M_X64) 01362 #define DRFLAC_X64 01363 #elif defined(__i386) || defined(_M_IX86) 01364 #define DRFLAC_X86 01365 #elif defined(__arm__) || defined(_M_ARM) 01366 #define DRFLAC_ARM 01367 #endif 01368 01369 /* Intrinsics Support */ 01370 #if !defined(DR_FLAC_NO_SIMD) 01371 #if defined(DRFLAC_X64) || defined(DRFLAC_X86) 01372 #if defined(_MSC_VER) && !defined(__clang__) 01373 /* MSVC. */ 01374 #if _MSC_VER >= 1400 && !defined(DRFLAC_NO_SSE2) /* 2005 */ 01375 #define DRFLAC_SUPPORT_SSE2 01376 #endif 01377 #if _MSC_VER >= 1600 && !defined(DRFLAC_NO_SSE41) /* 2010 */ 01378 #define DRFLAC_SUPPORT_SSE41 01379 #endif 01380 #else 01381 /* Assume GNUC-style. */ 01382 #if defined(__SSE2__) && !defined(DRFLAC_NO_SSE2) 01383 #define DRFLAC_SUPPORT_SSE2 01384 #endif 01385 #if defined(__SSE4_1__) && !defined(DRFLAC_NO_SSE41) 01386 #define DRFLAC_SUPPORT_SSE41 01387 #endif 01388 #endif 01389 01390 /* If at this point we still haven't determined compiler support for the intrinsics just fall back to __has_include. */ 01391 #if !defined(__GNUC__) && !defined(__clang__) && defined(__has_include) 01392 #if !defined(DRFLAC_SUPPORT_SSE2) && !defined(DRFLAC_NO_SSE2) && __has_include(<emmintrin.h>) 01393 #define DRFLAC_SUPPORT_SSE2 01394 #endif 01395 #if !defined(DRFLAC_SUPPORT_SSE41) && !defined(DRFLAC_NO_SSE41) && __has_include(<smmintrin.h>) 01396 #define DRFLAC_SUPPORT_SSE41 01397 #endif 01398 #endif 01399 01400 #if defined(DRFLAC_SUPPORT_SSE41) 01401 #include <smmintrin.h> 01402 #elif defined(DRFLAC_SUPPORT_SSE2) 01403 #include <emmintrin.h> 01404 #endif 01405 #endif 01406 01407 #if defined(DRFLAC_ARM) 01408 #if !defined(DRFLAC_NO_NEON) && (defined(__ARM_NEON) || defined(__aarch64__) || defined(_M_ARM64)) 01409 #define DRFLAC_SUPPORT_NEON 01410 #endif 01411 01412 /* Fall back to looking for the #include file. */ 01413 #if !defined(__GNUC__) && !defined(__clang__) && defined(__has_include) 01414 #if !defined(DRFLAC_SUPPORT_NEON) && !defined(DRFLAC_NO_NEON) && __has_include(<arm_neon.h>) 01415 #define DRFLAC_SUPPORT_NEON 01416 #endif 01417 #endif 01418 01419 #if defined(DRFLAC_SUPPORT_NEON) 01420 #include <arm_neon.h> 01421 #endif 01422 #endif 01423 #endif 01424 01425 /* Compile-time CPU feature support. */ 01426 #if !defined(DR_FLAC_NO_SIMD) && (defined(DRFLAC_X86) || defined(DRFLAC_X64)) 01427 #if defined(_MSC_VER) && !defined(__clang__) 01428 #if _MSC_VER >= 1400 01429 #include <intrin.h> 01430 static void drflac__cpuid(int info[4], int fid) 01431 { 01432 __cpuid(info, fid); 01433 } 01434 #else 01435 #define DRFLAC_NO_CPUID 01436 #endif 01437 #else 01438 #if defined(__GNUC__) || defined(__clang__) 01439 static void drflac__cpuid(int info[4], int fid) 01440 { 01441 /* 01442 It looks like the -fPIC option uses the ebx register which GCC complains about. We can work around this by just using a different register, the 01443 specific register of which I'm letting the compiler decide on. The "k" prefix is used to specify a 32-bit register. The {...} syntax is for 01444 supporting different assembly dialects. 01445 01446 What's basically happening is that we're saving and restoring the ebx register manually. 01447 */ 01448 #if defined(DRFLAC_X86) && defined(__PIC__) 01449 __asm__ __volatile__ ( 01450 "xchg{l} {%%}ebx, %k1;" 01451 "cpuid;" 01452 "xchg{l} {%%}ebx, %k1;" 01453 : "=a"(info[0]), "=&r"(info[1]), "=c"(info[2]), "=d"(info[3]) : "a"(fid), "c"(0) 01454 ); 01455 #else 01456 __asm__ __volatile__ ( 01457 "cpuid" : "=a"(info[0]), "=b"(info[1]), "=c"(info[2]), "=d"(info[3]) : "a"(fid), "c"(0) 01458 ); 01459 #endif 01460 } 01461 #else 01462 #define DRFLAC_NO_CPUID 01463 #endif 01464 #endif 01465 #else 01466 #define DRFLAC_NO_CPUID 01467 #endif 01468 01469 static DRFLAC_INLINE drflac_bool32 drflac_has_sse2(void) 01470 { 01471 #if defined(DRFLAC_SUPPORT_SSE2) 01472 #if (defined(DRFLAC_X64) || defined(DRFLAC_X86)) && !defined(DRFLAC_NO_SSE2) 01473 #if defined(DRFLAC_X64) 01474 return DRFLAC_TRUE; /* 64-bit targets always support SSE2. */ 01475 #elif (defined(_M_IX86_FP) && _M_IX86_FP == 2) || defined(__SSE2__) 01476 return DRFLAC_TRUE; /* If the compiler is allowed to freely generate SSE2 code we can assume support. */ 01477 #else 01478 #if defined(DRFLAC_NO_CPUID) 01479 return DRFLAC_FALSE; 01480 #else 01481 int info[4]; 01482 drflac__cpuid(info, 1); 01483 return (info[3] & (1 << 26)) != 0; 01484 #endif 01485 #endif 01486 #else 01487 return DRFLAC_FALSE; /* SSE2 is only supported on x86 and x64 architectures. */ 01488 #endif 01489 #else 01490 return DRFLAC_FALSE; /* No compiler support. */ 01491 #endif 01492 } 01493 01494 static DRFLAC_INLINE drflac_bool32 drflac_has_sse41(void) 01495 { 01496 #if defined(DRFLAC_SUPPORT_SSE41) 01497 #if (defined(DRFLAC_X64) || defined(DRFLAC_X86)) && !defined(DRFLAC_NO_SSE41) 01498 #if defined(DRFLAC_X64) 01499 return DRFLAC_TRUE; /* 64-bit targets always support SSE4.1. */ 01500 #elif (defined(_M_IX86_FP) && _M_IX86_FP == 2) || defined(__SSE4_1__) 01501 return DRFLAC_TRUE; /* If the compiler is allowed to freely generate SSE41 code we can assume support. */ 01502 #else 01503 #if defined(DRFLAC_NO_CPUID) 01504 return DRFLAC_FALSE; 01505 #else 01506 int info[4]; 01507 drflac__cpuid(info, 1); 01508 return (info[2] & (1 << 19)) != 0; 01509 #endif 01510 #endif 01511 #else 01512 return DRFLAC_FALSE; /* SSE41 is only supported on x86 and x64 architectures. */ 01513 #endif 01514 #else 01515 return DRFLAC_FALSE; /* No compiler support. */ 01516 #endif 01517 } 01518 01519 01520 #if defined(_MSC_VER) && _MSC_VER >= 1500 && (defined(DRFLAC_X86) || defined(DRFLAC_X64)) 01521 #define DRFLAC_HAS_LZCNT_INTRINSIC 01522 #elif (defined(__GNUC__) && ((__GNUC__ > 4) || (__GNUC__ == 4 && __GNUC_MINOR__ >= 7))) 01523 #define DRFLAC_HAS_LZCNT_INTRINSIC 01524 #elif defined(__clang__) 01525 #if defined(__has_builtin) 01526 #if __has_builtin(__builtin_clzll) || __has_builtin(__builtin_clzl) 01527 #define DRFLAC_HAS_LZCNT_INTRINSIC 01528 #endif 01529 #endif 01530 #endif 01531 01532 #if defined(_MSC_VER) && _MSC_VER >= 1400 01533 #define DRFLAC_HAS_BYTESWAP16_INTRINSIC 01534 #define DRFLAC_HAS_BYTESWAP32_INTRINSIC 01535 #define DRFLAC_HAS_BYTESWAP64_INTRINSIC 01536 #elif defined(__clang__) 01537 #if defined(__has_builtin) 01538 #if __has_builtin(__builtin_bswap16) 01539 #define DRFLAC_HAS_BYTESWAP16_INTRINSIC 01540 #endif 01541 #if __has_builtin(__builtin_bswap32) 01542 #define DRFLAC_HAS_BYTESWAP32_INTRINSIC 01543 #endif 01544 #if __has_builtin(__builtin_bswap64) 01545 #define DRFLAC_HAS_BYTESWAP64_INTRINSIC 01546 #endif 01547 #endif 01548 #elif defined(__GNUC__) 01549 #if ((__GNUC__ > 4) || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3)) 01550 #define DRFLAC_HAS_BYTESWAP32_INTRINSIC 01551 #define DRFLAC_HAS_BYTESWAP64_INTRINSIC 01552 #endif 01553 #if ((__GNUC__ > 4) || (__GNUC__ == 4 && __GNUC_MINOR__ >= 8)) 01554 #define DRFLAC_HAS_BYTESWAP16_INTRINSIC 01555 #endif 01556 #endif 01557 01558 01559 /* Standard library stuff. */ 01560 #ifndef DRFLAC_ASSERT 01561 #include <assert.h> 01562 #define DRFLAC_ASSERT(expression) assert(expression) 01563 #endif 01564 #ifndef DRFLAC_MALLOC 01565 #define DRFLAC_MALLOC(sz) malloc((sz)) 01566 #endif 01567 #ifndef DRFLAC_REALLOC 01568 #define DRFLAC_REALLOC(p, sz) realloc((p), (sz)) 01569 #endif 01570 #ifndef DRFLAC_FREE 01571 #define DRFLAC_FREE(p) free((p)) 01572 #endif 01573 #ifndef DRFLAC_COPY_MEMORY 01574 #define DRFLAC_COPY_MEMORY(dst, src, sz) memcpy((dst), (src), (sz)) 01575 #endif 01576 #ifndef DRFLAC_ZERO_MEMORY 01577 #define DRFLAC_ZERO_MEMORY(p, sz) memset((p), 0, (sz)) 01578 #endif 01579 #ifndef DRFLAC_ZERO_OBJECT 01580 #define DRFLAC_ZERO_OBJECT(p) DRFLAC_ZERO_MEMORY((p), sizeof(*(p))) 01581 #endif 01582 01583 #define DRFLAC_MAX_SIMD_VECTOR_SIZE 64 /* 64 for AVX-512 in the future. */ 01584 01585 typedef drflac_int32 drflac_result; 01586 #define DRFLAC_SUCCESS 0 01587 #define DRFLAC_ERROR -1 /* A generic error. */ 01588 #define DRFLAC_INVALID_ARGS -2 01589 #define DRFLAC_INVALID_OPERATION -3 01590 #define DRFLAC_OUT_OF_MEMORY -4 01591 #define DRFLAC_OUT_OF_RANGE -5 01592 #define DRFLAC_ACCESS_DENIED -6 01593 #define DRFLAC_DOES_NOT_EXIST -7 01594 #define DRFLAC_ALREADY_EXISTS -8 01595 #define DRFLAC_TOO_MANY_OPEN_FILES -9 01596 #define DRFLAC_INVALID_FILE -10 01597 #define DRFLAC_TOO_BIG -11 01598 #define DRFLAC_PATH_TOO_LONG -12 01599 #define DRFLAC_NAME_TOO_LONG -13 01600 #define DRFLAC_NOT_DIRECTORY -14 01601 #define DRFLAC_IS_DIRECTORY -15 01602 #define DRFLAC_DIRECTORY_NOT_EMPTY -16 01603 #define DRFLAC_END_OF_FILE -17 01604 #define DRFLAC_NO_SPACE -18 01605 #define DRFLAC_BUSY -19 01606 #define DRFLAC_IO_ERROR -20 01607 #define DRFLAC_INTERRUPT -21 01608 #define DRFLAC_UNAVAILABLE -22 01609 #define DRFLAC_ALREADY_IN_USE -23 01610 #define DRFLAC_BAD_ADDRESS -24 01611 #define DRFLAC_BAD_SEEK -25 01612 #define DRFLAC_BAD_PIPE -26 01613 #define DRFLAC_DEADLOCK -27 01614 #define DRFLAC_TOO_MANY_LINKS -28 01615 #define DRFLAC_NOT_IMPLEMENTED -29 01616 #define DRFLAC_NO_MESSAGE -30 01617 #define DRFLAC_BAD_MESSAGE -31 01618 #define DRFLAC_NO_DATA_AVAILABLE -32 01619 #define DRFLAC_INVALID_DATA -33 01620 #define DRFLAC_TIMEOUT -34 01621 #define DRFLAC_NO_NETWORK -35 01622 #define DRFLAC_NOT_UNIQUE -36 01623 #define DRFLAC_NOT_SOCKET -37 01624 #define DRFLAC_NO_ADDRESS -38 01625 #define DRFLAC_BAD_PROTOCOL -39 01626 #define DRFLAC_PROTOCOL_UNAVAILABLE -40 01627 #define DRFLAC_PROTOCOL_NOT_SUPPORTED -41 01628 #define DRFLAC_PROTOCOL_FAMILY_NOT_SUPPORTED -42 01629 #define DRFLAC_ADDRESS_FAMILY_NOT_SUPPORTED -43 01630 #define DRFLAC_SOCKET_NOT_SUPPORTED -44 01631 #define DRFLAC_CONNECTION_RESET -45 01632 #define DRFLAC_ALREADY_CONNECTED -46 01633 #define DRFLAC_NOT_CONNECTED -47 01634 #define DRFLAC_CONNECTION_REFUSED -48 01635 #define DRFLAC_NO_HOST -49 01636 #define DRFLAC_IN_PROGRESS -50 01637 #define DRFLAC_CANCELLED -51 01638 #define DRFLAC_MEMORY_ALREADY_MAPPED -52 01639 #define DRFLAC_AT_END -53 01640 #define DRFLAC_CRC_MISMATCH -128 01641 01642 #define DRFLAC_SUBFRAME_CONSTANT 0 01643 #define DRFLAC_SUBFRAME_VERBATIM 1 01644 #define DRFLAC_SUBFRAME_FIXED 8 01645 #define DRFLAC_SUBFRAME_LPC 32 01646 #define DRFLAC_SUBFRAME_RESERVED 255 01647 01648 #define DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE 0 01649 #define DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2 1 01650 01651 #define DRFLAC_CHANNEL_ASSIGNMENT_INDEPENDENT 0 01652 #define DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE 8 01653 #define DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE 9 01654 #define DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE 10 01655 01656 #define drflac_align(x, a) ((((x) + (a) - 1) / (a)) * (a)) 01657 01658 01659 DRFLAC_API void drflac_version(drflac_uint32* pMajor, drflac_uint32* pMinor, drflac_uint32* pRevision) 01660 { 01661 if (pMajor) { 01662 *pMajor = DRFLAC_VERSION_MAJOR; 01663 } 01664 01665 if (pMinor) { 01666 *pMinor = DRFLAC_VERSION_MINOR; 01667 } 01668 01669 if (pRevision) { 01670 *pRevision = DRFLAC_VERSION_REVISION; 01671 } 01672 } 01673 01674 DRFLAC_API const char* drflac_version_string() 01675 { 01676 return DRFLAC_VERSION_STRING; 01677 } 01678 01679 01680 /* CPU caps. */ 01681 #if defined(__has_feature) 01682 #if __has_feature(thread_sanitizer) 01683 #define DRFLAC_NO_THREAD_SANITIZE __attribute__((no_sanitize("thread"))) 01684 #else 01685 #define DRFLAC_NO_THREAD_SANITIZE 01686 #endif 01687 #else 01688 #define DRFLAC_NO_THREAD_SANITIZE 01689 #endif 01690 01691 #if defined(DRFLAC_HAS_LZCNT_INTRINSIC) 01692 static drflac_bool32 drflac__gIsLZCNTSupported = DRFLAC_FALSE; 01693 #endif 01694 01695 #ifndef DRFLAC_NO_CPUID 01696 static drflac_bool32 drflac__gIsSSE2Supported = DRFLAC_FALSE; 01697 static drflac_bool32 drflac__gIsSSE41Supported = DRFLAC_FALSE; 01698 01699 /* 01700 I've had a bug report that Clang's ThreadSanitizer presents a warning in this function. Having reviewed this, this does 01701 actually make sense. However, since CPU caps should never differ for a running process, I don't think the trade off of 01702 complicating internal API's by passing around CPU caps versus just disabling the warnings is worthwhile. I'm therefore 01703 just going to disable these warnings. This is disabled via the DRFLAC_NO_THREAD_SANITIZE attribute. 01704 */ 01705 DRFLAC_NO_THREAD_SANITIZE static void drflac__init_cpu_caps(void) 01706 { 01707 static drflac_bool32 isCPUCapsInitialized = DRFLAC_FALSE; 01708 01709 if (!isCPUCapsInitialized) { 01710 /* LZCNT */ 01711 #if defined(DRFLAC_HAS_LZCNT_INTRINSIC) 01712 int info[4] = {0}; 01713 drflac__cpuid(info, 0x80000001); 01714 drflac__gIsLZCNTSupported = (info[2] & (1 << 5)) != 0; 01715 #endif 01716 01717 /* SSE2 */ 01718 drflac__gIsSSE2Supported = drflac_has_sse2(); 01719 01720 /* SSE4.1 */ 01721 drflac__gIsSSE41Supported = drflac_has_sse41(); 01722 01723 /* Initialized. */ 01724 isCPUCapsInitialized = DRFLAC_TRUE; 01725 } 01726 } 01727 #else 01728 static drflac_bool32 drflac__gIsNEONSupported = DRFLAC_FALSE; 01729 01730 static DRFLAC_INLINE drflac_bool32 drflac__has_neon(void) 01731 { 01732 #if defined(DRFLAC_SUPPORT_NEON) 01733 #if defined(DRFLAC_ARM) && !defined(DRFLAC_NO_NEON) 01734 #if (defined(__ARM_NEON) || defined(__aarch64__) || defined(_M_ARM64)) 01735 return DRFLAC_TRUE; /* If the compiler is allowed to freely generate NEON code we can assume support. */ 01736 #else 01737 /* TODO: Runtime check. */ 01738 return DRFLAC_FALSE; 01739 #endif 01740 #else 01741 return DRFLAC_FALSE; /* NEON is only supported on ARM architectures. */ 01742 #endif 01743 #else 01744 return DRFLAC_FALSE; /* No compiler support. */ 01745 #endif 01746 } 01747 01748 DRFLAC_NO_THREAD_SANITIZE static void drflac__init_cpu_caps(void) 01749 { 01750 drflac__gIsNEONSupported = drflac__has_neon(); 01751 01752 #if defined(DRFLAC_HAS_LZCNT_INTRINSIC) && defined(DRFLAC_ARM) && (defined(__ARM_ARCH) && __ARM_ARCH >= 5) 01753 drflac__gIsLZCNTSupported = DRFLAC_TRUE; 01754 #endif 01755 } 01756 #endif 01757 01758 01759 /* Endian Management */ 01760 static DRFLAC_INLINE drflac_bool32 drflac__is_little_endian(void) 01761 { 01762 #if defined(DRFLAC_X86) || defined(DRFLAC_X64) 01763 return DRFLAC_TRUE; 01764 #elif defined(__BYTE_ORDER) && defined(__LITTLE_ENDIAN) && __BYTE_ORDER == __LITTLE_ENDIAN 01765 return DRFLAC_TRUE; 01766 #else 01767 int n = 1; 01768 return (*(char*)&n) == 1; 01769 #endif 01770 } 01771 01772 static DRFLAC_INLINE drflac_uint16 drflac__swap_endian_uint16(drflac_uint16 n) 01773 { 01774 #ifdef DRFLAC_HAS_BYTESWAP16_INTRINSIC 01775 #if defined(_MSC_VER) 01776 return _byteswap_ushort(n); 01777 #elif defined(__GNUC__) || defined(__clang__) 01778 return __builtin_bswap16(n); 01779 #else 01780 #error "This compiler does not support the byte swap intrinsic." 01781 #endif 01782 #else 01783 return ((n & 0xFF00) >> 8) | 01784 ((n & 0x00FF) << 8); 01785 #endif 01786 } 01787 01788 static DRFLAC_INLINE drflac_uint32 drflac__swap_endian_uint32(drflac_uint32 n) 01789 { 01790 #ifdef DRFLAC_HAS_BYTESWAP32_INTRINSIC 01791 #if defined(_MSC_VER) 01792 return _byteswap_ulong(n); 01793 #elif defined(__GNUC__) || defined(__clang__) 01794 #if defined(DRFLAC_ARM) && (defined(__ARM_ARCH) && __ARM_ARCH >= 6) && !defined(DRFLAC_64BIT) /* <-- 64-bit inline assembly has not been tested, so disabling for now. */ 01795 /* Inline assembly optimized implementation for ARM. In my testing, GCC does not generate optimized code with __builtin_bswap32(). */ 01796 drflac_uint32 r; 01797 __asm__ __volatile__ ( 01798 #if defined(DRFLAC_64BIT) 01799 "rev %w[out], %w[in]" : [out]"=r"(r) : [in]"r"(n) /* <-- This is untested. If someone in the community could test this, that would be appreciated! */ 01800 #else 01801 "rev %[out], %[in]" : [out]"=r"(r) : [in]"r"(n) 01802 #endif 01803 ); 01804 return r; 01805 #else 01806 return __builtin_bswap32(n); 01807 #endif 01808 #else 01809 #error "This compiler does not support the byte swap intrinsic." 01810 #endif 01811 #else 01812 return ((n & 0xFF000000) >> 24) | 01813 ((n & 0x00FF0000) >> 8) | 01814 ((n & 0x0000FF00) << 8) | 01815 ((n & 0x000000FF) << 24); 01816 #endif 01817 } 01818 01819 static DRFLAC_INLINE drflac_uint64 drflac__swap_endian_uint64(drflac_uint64 n) 01820 { 01821 #ifdef DRFLAC_HAS_BYTESWAP64_INTRINSIC 01822 #if defined(_MSC_VER) 01823 return _byteswap_uint64(n); 01824 #elif defined(__GNUC__) || defined(__clang__) 01825 return __builtin_bswap64(n); 01826 #else 01827 #error "This compiler does not support the byte swap intrinsic." 01828 #endif 01829 #else 01830 return ((n & (drflac_uint64)0xFF00000000000000) >> 56) | 01831 ((n & (drflac_uint64)0x00FF000000000000) >> 40) | 01832 ((n & (drflac_uint64)0x0000FF0000000000) >> 24) | 01833 ((n & (drflac_uint64)0x000000FF00000000) >> 8) | 01834 ((n & (drflac_uint64)0x00000000FF000000) << 8) | 01835 ((n & (drflac_uint64)0x0000000000FF0000) << 24) | 01836 ((n & (drflac_uint64)0x000000000000FF00) << 40) | 01837 ((n & (drflac_uint64)0x00000000000000FF) << 56); 01838 #endif 01839 } 01840 01841 01842 static DRFLAC_INLINE drflac_uint16 drflac__be2host_16(drflac_uint16 n) 01843 { 01844 if (drflac__is_little_endian()) { 01845 return drflac__swap_endian_uint16(n); 01846 } 01847 01848 return n; 01849 } 01850 01851 static DRFLAC_INLINE drflac_uint32 drflac__be2host_32(drflac_uint32 n) 01852 { 01853 if (drflac__is_little_endian()) { 01854 return drflac__swap_endian_uint32(n); 01855 } 01856 01857 return n; 01858 } 01859 01860 static DRFLAC_INLINE drflac_uint64 drflac__be2host_64(drflac_uint64 n) 01861 { 01862 if (drflac__is_little_endian()) { 01863 return drflac__swap_endian_uint64(n); 01864 } 01865 01866 return n; 01867 } 01868 01869 01870 static DRFLAC_INLINE drflac_uint32 drflac__le2host_32(drflac_uint32 n) 01871 { 01872 if (!drflac__is_little_endian()) { 01873 return drflac__swap_endian_uint32(n); 01874 } 01875 01876 return n; 01877 } 01878 01879 01880 static DRFLAC_INLINE drflac_uint32 drflac__unsynchsafe_32(drflac_uint32 n) 01881 { 01882 drflac_uint32 result = 0; 01883 result |= (n & 0x7F000000) >> 3; 01884 result |= (n & 0x007F0000) >> 2; 01885 result |= (n & 0x00007F00) >> 1; 01886 result |= (n & 0x0000007F) >> 0; 01887 01888 return result; 01889 } 01890 01891 01892 01893 /* The CRC code below is based on this document: http://zlib.net/crc_v3.txt */ 01894 static drflac_uint8 drflac__crc8_table[] = { 01895 0x00, 0x07, 0x0E, 0x09, 0x1C, 0x1B, 0x12, 0x15, 0x38, 0x3F, 0x36, 0x31, 0x24, 0x23, 0x2A, 0x2D, 01896 0x70, 0x77, 0x7E, 0x79, 0x6C, 0x6B, 0x62, 0x65, 0x48, 0x4F, 0x46, 0x41, 0x54, 0x53, 0x5A, 0x5D, 01897 0xE0, 0xE7, 0xEE, 0xE9, 0xFC, 0xFB, 0xF2, 0xF5, 0xD8, 0xDF, 0xD6, 0xD1, 0xC4, 0xC3, 0xCA, 0xCD, 01898 0x90, 0x97, 0x9E, 0x99, 0x8C, 0x8B, 0x82, 0x85, 0xA8, 0xAF, 0xA6, 0xA1, 0xB4, 0xB3, 0xBA, 0xBD, 01899 0xC7, 0xC0, 0xC9, 0xCE, 0xDB, 0xDC, 0xD5, 0xD2, 0xFF, 0xF8, 0xF1, 0xF6, 0xE3, 0xE4, 0xED, 0xEA, 01900 0xB7, 0xB0, 0xB9, 0xBE, 0xAB, 0xAC, 0xA5, 0xA2, 0x8F, 0x88, 0x81, 0x86, 0x93, 0x94, 0x9D, 0x9A, 01901 0x27, 0x20, 0x29, 0x2E, 0x3B, 0x3C, 0x35, 0x32, 0x1F, 0x18, 0x11, 0x16, 0x03, 0x04, 0x0D, 0x0A, 01902 0x57, 0x50, 0x59, 0x5E, 0x4B, 0x4C, 0x45, 0x42, 0x6F, 0x68, 0x61, 0x66, 0x73, 0x74, 0x7D, 0x7A, 01903 0x89, 0x8E, 0x87, 0x80, 0x95, 0x92, 0x9B, 0x9C, 0xB1, 0xB6, 0xBF, 0xB8, 0xAD, 0xAA, 0xA3, 0xA4, 01904 0xF9, 0xFE, 0xF7, 0xF0, 0xE5, 0xE2, 0xEB, 0xEC, 0xC1, 0xC6, 0xCF, 0xC8, 0xDD, 0xDA, 0xD3, 0xD4, 01905 0x69, 0x6E, 0x67, 0x60, 0x75, 0x72, 0x7B, 0x7C, 0x51, 0x56, 0x5F, 0x58, 0x4D, 0x4A, 0x43, 0x44, 01906 0x19, 0x1E, 0x17, 0x10, 0x05, 0x02, 0x0B, 0x0C, 0x21, 0x26, 0x2F, 0x28, 0x3D, 0x3A, 0x33, 0x34, 01907 0x4E, 0x49, 0x40, 0x47, 0x52, 0x55, 0x5C, 0x5B, 0x76, 0x71, 0x78, 0x7F, 0x6A, 0x6D, 0x64, 0x63, 01908 0x3E, 0x39, 0x30, 0x37, 0x22, 0x25, 0x2C, 0x2B, 0x06, 0x01, 0x08, 0x0F, 0x1A, 0x1D, 0x14, 0x13, 01909 0xAE, 0xA9, 0xA0, 0xA7, 0xB2, 0xB5, 0xBC, 0xBB, 0x96, 0x91, 0x98, 0x9F, 0x8A, 0x8D, 0x84, 0x83, 01910 0xDE, 0xD9, 0xD0, 0xD7, 0xC2, 0xC5, 0xCC, 0xCB, 0xE6, 0xE1, 0xE8, 0xEF, 0xFA, 0xFD, 0xF4, 0xF3 01911 }; 01912 01913 static drflac_uint16 drflac__crc16_table[] = { 01914 0x0000, 0x8005, 0x800F, 0x000A, 0x801B, 0x001E, 0x0014, 0x8011, 01915 0x8033, 0x0036, 0x003C, 0x8039, 0x0028, 0x802D, 0x8027, 0x0022, 01916 0x8063, 0x0066, 0x006C, 0x8069, 0x0078, 0x807D, 0x8077, 0x0072, 01917 0x0050, 0x8055, 0x805F, 0x005A, 0x804B, 0x004E, 0x0044, 0x8041, 01918 0x80C3, 0x00C6, 0x00CC, 0x80C9, 0x00D8, 0x80DD, 0x80D7, 0x00D2, 01919 0x00F0, 0x80F5, 0x80FF, 0x00FA, 0x80EB, 0x00EE, 0x00E4, 0x80E1, 01920 0x00A0, 0x80A5, 0x80AF, 0x00AA, 0x80BB, 0x00BE, 0x00B4, 0x80B1, 01921 0x8093, 0x0096, 0x009C, 0x8099, 0x0088, 0x808D, 0x8087, 0x0082, 01922 0x8183, 0x0186, 0x018C, 0x8189, 0x0198, 0x819D, 0x8197, 0x0192, 01923 0x01B0, 0x81B5, 0x81BF, 0x01BA, 0x81AB, 0x01AE, 0x01A4, 0x81A1, 01924 0x01E0, 0x81E5, 0x81EF, 0x01EA, 0x81FB, 0x01FE, 0x01F4, 0x81F1, 01925 0x81D3, 0x01D6, 0x01DC, 0x81D9, 0x01C8, 0x81CD, 0x81C7, 0x01C2, 01926 0x0140, 0x8145, 0x814F, 0x014A, 0x815B, 0x015E, 0x0154, 0x8151, 01927 0x8173, 0x0176, 0x017C, 0x8179, 0x0168, 0x816D, 0x8167, 0x0162, 01928 0x8123, 0x0126, 0x012C, 0x8129, 0x0138, 0x813D, 0x8137, 0x0132, 01929 0x0110, 0x8115, 0x811F, 0x011A, 0x810B, 0x010E, 0x0104, 0x8101, 01930 0x8303, 0x0306, 0x030C, 0x8309, 0x0318, 0x831D, 0x8317, 0x0312, 01931 0x0330, 0x8335, 0x833F, 0x033A, 0x832B, 0x032E, 0x0324, 0x8321, 01932 0x0360, 0x8365, 0x836F, 0x036A, 0x837B, 0x037E, 0x0374, 0x8371, 01933 0x8353, 0x0356, 0x035C, 0x8359, 0x0348, 0x834D, 0x8347, 0x0342, 01934 0x03C0, 0x83C5, 0x83CF, 0x03CA, 0x83DB, 0x03DE, 0x03D4, 0x83D1, 01935 0x83F3, 0x03F6, 0x03FC, 0x83F9, 0x03E8, 0x83ED, 0x83E7, 0x03E2, 01936 0x83A3, 0x03A6, 0x03AC, 0x83A9, 0x03B8, 0x83BD, 0x83B7, 0x03B2, 01937 0x0390, 0x8395, 0x839F, 0x039A, 0x838B, 0x038E, 0x0384, 0x8381, 01938 0x0280, 0x8285, 0x828F, 0x028A, 0x829B, 0x029E, 0x0294, 0x8291, 01939 0x82B3, 0x02B6, 0x02BC, 0x82B9, 0x02A8, 0x82AD, 0x82A7, 0x02A2, 01940 0x82E3, 0x02E6, 0x02EC, 0x82E9, 0x02F8, 0x82FD, 0x82F7, 0x02F2, 01941 0x02D0, 0x82D5, 0x82DF, 0x02DA, 0x82CB, 0x02CE, 0x02C4, 0x82C1, 01942 0x8243, 0x0246, 0x024C, 0x8249, 0x0258, 0x825D, 0x8257, 0x0252, 01943 0x0270, 0x8275, 0x827F, 0x027A, 0x826B, 0x026E, 0x0264, 0x8261, 01944 0x0220, 0x8225, 0x822F, 0x022A, 0x823B, 0x023E, 0x0234, 0x8231, 01945 0x8213, 0x0216, 0x021C, 0x8219, 0x0208, 0x820D, 0x8207, 0x0202 01946 }; 01947 01948 static DRFLAC_INLINE drflac_uint8 drflac_crc8_byte(drflac_uint8 crc, drflac_uint8 data) 01949 { 01950 return drflac__crc8_table[crc ^ data]; 01951 } 01952 01953 static DRFLAC_INLINE drflac_uint8 drflac_crc8(drflac_uint8 crc, drflac_uint32 data, drflac_uint32 count) 01954 { 01955 #ifdef DR_FLAC_NO_CRC 01956 (void)crc; 01957 (void)data; 01958 (void)count; 01959 return 0; 01960 #else 01961 #if 0 01962 /* REFERENCE (use of this implementation requires an explicit flush by doing "drflac_crc8(crc, 0, 8);") */ 01963 drflac_uint8 p = 0x07; 01964 for (int i = count-1; i >= 0; --i) { 01965 drflac_uint8 bit = (data & (1 << i)) >> i; 01966 if (crc & 0x80) { 01967 crc = ((crc << 1) | bit) ^ p; 01968 } else { 01969 crc = ((crc << 1) | bit); 01970 } 01971 } 01972 return crc; 01973 #else 01974 drflac_uint32 wholeBytes; 01975 drflac_uint32 leftoverBits; 01976 drflac_uint64 leftoverDataMask; 01977 01978 static drflac_uint64 leftoverDataMaskTable[8] = { 01979 0x00, 0x01, 0x03, 0x07, 0x0F, 0x1F, 0x3F, 0x7F 01980 }; 01981 01982 DRFLAC_ASSERT(count <= 32); 01983 01984 wholeBytes = count >> 3; 01985 leftoverBits = count - (wholeBytes*8); 01986 leftoverDataMask = leftoverDataMaskTable[leftoverBits]; 01987 01988 switch (wholeBytes) { 01989 case 4: crc = drflac_crc8_byte(crc, (drflac_uint8)((data & (0xFF000000UL << leftoverBits)) >> (24 + leftoverBits))); 01990 case 3: crc = drflac_crc8_byte(crc, (drflac_uint8)((data & (0x00FF0000UL << leftoverBits)) >> (16 + leftoverBits))); 01991 case 2: crc = drflac_crc8_byte(crc, (drflac_uint8)((data & (0x0000FF00UL << leftoverBits)) >> ( 8 + leftoverBits))); 01992 case 1: crc = drflac_crc8_byte(crc, (drflac_uint8)((data & (0x000000FFUL << leftoverBits)) >> ( 0 + leftoverBits))); 01993 case 0: if (leftoverBits > 0) crc = (drflac_uint8)((crc << leftoverBits) ^ drflac__crc8_table[(crc >> (8 - leftoverBits)) ^ (data & leftoverDataMask)]); 01994 } 01995 return crc; 01996 #endif 01997 #endif 01998 } 01999 02000 static DRFLAC_INLINE drflac_uint16 drflac_crc16_byte(drflac_uint16 crc, drflac_uint8 data) 02001 { 02002 return (crc << 8) ^ drflac__crc16_table[(drflac_uint8)(crc >> 8) ^ data]; 02003 } 02004 02005 static DRFLAC_INLINE drflac_uint16 drflac_crc16_cache(drflac_uint16 crc, drflac_cache_t data) 02006 { 02007 #ifdef DRFLAC_64BIT 02008 crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 56) & 0xFF)); 02009 crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 48) & 0xFF)); 02010 crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 40) & 0xFF)); 02011 crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 32) & 0xFF)); 02012 #endif 02013 crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 24) & 0xFF)); 02014 crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 16) & 0xFF)); 02015 crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 8) & 0xFF)); 02016 crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 0) & 0xFF)); 02017 02018 return crc; 02019 } 02020 02021 static DRFLAC_INLINE drflac_uint16 drflac_crc16_bytes(drflac_uint16 crc, drflac_cache_t data, drflac_uint32 byteCount) 02022 { 02023 switch (byteCount) 02024 { 02025 #ifdef DRFLAC_64BIT 02026 case 8: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 56) & 0xFF)); 02027 case 7: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 48) & 0xFF)); 02028 case 6: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 40) & 0xFF)); 02029 case 5: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 32) & 0xFF)); 02030 #endif 02031 case 4: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 24) & 0xFF)); 02032 case 3: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 16) & 0xFF)); 02033 case 2: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 8) & 0xFF)); 02034 case 1: crc = drflac_crc16_byte(crc, (drflac_uint8)((data >> 0) & 0xFF)); 02035 } 02036 02037 return crc; 02038 } 02039 02040 #if 0 02041 static DRFLAC_INLINE drflac_uint16 drflac_crc16__32bit(drflac_uint16 crc, drflac_uint32 data, drflac_uint32 count) 02042 { 02043 #ifdef DR_FLAC_NO_CRC 02044 (void)crc; 02045 (void)data; 02046 (void)count; 02047 return 0; 02048 #else 02049 #if 0 02050 /* REFERENCE (use of this implementation requires an explicit flush by doing "drflac_crc16(crc, 0, 16);") */ 02051 drflac_uint16 p = 0x8005; 02052 for (int i = count-1; i >= 0; --i) { 02053 drflac_uint16 bit = (data & (1ULL << i)) >> i; 02054 if (r & 0x8000) { 02055 r = ((r << 1) | bit) ^ p; 02056 } else { 02057 r = ((r << 1) | bit); 02058 } 02059 } 02060 02061 return crc; 02062 #else 02063 drflac_uint32 wholeBytes; 02064 drflac_uint32 leftoverBits; 02065 drflac_uint64 leftoverDataMask; 02066 02067 static drflac_uint64 leftoverDataMaskTable[8] = { 02068 0x00, 0x01, 0x03, 0x07, 0x0F, 0x1F, 0x3F, 0x7F 02069 }; 02070 02071 DRFLAC_ASSERT(count <= 64); 02072 02073 wholeBytes = count >> 3; 02074 leftoverBits = count & 7; 02075 leftoverDataMask = leftoverDataMaskTable[leftoverBits]; 02076 02077 switch (wholeBytes) { 02078 default: 02079 case 4: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (0xFF000000UL << leftoverBits)) >> (24 + leftoverBits))); 02080 case 3: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (0x00FF0000UL << leftoverBits)) >> (16 + leftoverBits))); 02081 case 2: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (0x0000FF00UL << leftoverBits)) >> ( 8 + leftoverBits))); 02082 case 1: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (0x000000FFUL << leftoverBits)) >> ( 0 + leftoverBits))); 02083 case 0: if (leftoverBits > 0) crc = (crc << leftoverBits) ^ drflac__crc16_table[(crc >> (16 - leftoverBits)) ^ (data & leftoverDataMask)]; 02084 } 02085 return crc; 02086 #endif 02087 #endif 02088 } 02089 02090 static DRFLAC_INLINE drflac_uint16 drflac_crc16__64bit(drflac_uint16 crc, drflac_uint64 data, drflac_uint32 count) 02091 { 02092 #ifdef DR_FLAC_NO_CRC 02093 (void)crc; 02094 (void)data; 02095 (void)count; 02096 return 0; 02097 #else 02098 drflac_uint32 wholeBytes; 02099 drflac_uint32 leftoverBits; 02100 drflac_uint64 leftoverDataMask; 02101 02102 static drflac_uint64 leftoverDataMaskTable[8] = { 02103 0x00, 0x01, 0x03, 0x07, 0x0F, 0x1F, 0x3F, 0x7F 02104 }; 02105 02106 DRFLAC_ASSERT(count <= 64); 02107 02108 wholeBytes = count >> 3; 02109 leftoverBits = count & 7; 02110 leftoverDataMask = leftoverDataMaskTable[leftoverBits]; 02111 02112 switch (wholeBytes) { 02113 default: 02114 case 8: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0xFF000000 << 32) << leftoverBits)) >> (56 + leftoverBits))); /* Weird "<< 32" bitshift is required for C89 because it doesn't support 64-bit constants. Should be optimized out by a good compiler. */ 02115 case 7: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x00FF0000 << 32) << leftoverBits)) >> (48 + leftoverBits))); 02116 case 6: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x0000FF00 << 32) << leftoverBits)) >> (40 + leftoverBits))); 02117 case 5: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x000000FF << 32) << leftoverBits)) >> (32 + leftoverBits))); 02118 case 4: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0xFF000000 ) << leftoverBits)) >> (24 + leftoverBits))); 02119 case 3: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x00FF0000 ) << leftoverBits)) >> (16 + leftoverBits))); 02120 case 2: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x0000FF00 ) << leftoverBits)) >> ( 8 + leftoverBits))); 02121 case 1: crc = drflac_crc16_byte(crc, (drflac_uint8)((data & (((drflac_uint64)0x000000FF ) << leftoverBits)) >> ( 0 + leftoverBits))); 02122 case 0: if (leftoverBits > 0) crc = (crc << leftoverBits) ^ drflac__crc16_table[(crc >> (16 - leftoverBits)) ^ (data & leftoverDataMask)]; 02123 } 02124 return crc; 02125 #endif 02126 } 02127 02128 02129 static DRFLAC_INLINE drflac_uint16 drflac_crc16(drflac_uint16 crc, drflac_cache_t data, drflac_uint32 count) 02130 { 02131 #ifdef DRFLAC_64BIT 02132 return drflac_crc16__64bit(crc, data, count); 02133 #else 02134 return drflac_crc16__32bit(crc, data, count); 02135 #endif 02136 } 02137 #endif 02138 02139 02140 #ifdef DRFLAC_64BIT 02141 #define drflac__be2host__cache_line drflac__be2host_64 02142 #else 02143 #define drflac__be2host__cache_line drflac__be2host_32 02144 #endif 02145 02146 /* 02147 BIT READING ATTEMPT #2 02148 02149 This uses a 32- or 64-bit bit-shifted cache - as bits are read, the cache is shifted such that the first valid bit is sitting 02150 on the most significant bit. It uses the notion of an L1 and L2 cache (borrowed from CPU architecture), where the L1 cache 02151 is a 32- or 64-bit unsigned integer (depending on whether or not a 32- or 64-bit build is being compiled) and the L2 is an 02152 array of "cache lines", with each cache line being the same size as the L1. The L2 is a buffer of about 4KB and is where data 02153 from onRead() is read into. 02154 */ 02155 #define DRFLAC_CACHE_L1_SIZE_BYTES(bs) (sizeof((bs)->cache)) 02156 #define DRFLAC_CACHE_L1_SIZE_BITS(bs) (sizeof((bs)->cache)*8) 02157 #define DRFLAC_CACHE_L1_BITS_REMAINING(bs) (DRFLAC_CACHE_L1_SIZE_BITS(bs) - (bs)->consumedBits) 02158 #define DRFLAC_CACHE_L1_SELECTION_MASK(_bitCount) (~((~(drflac_cache_t)0) >> (_bitCount))) 02159 #define DRFLAC_CACHE_L1_SELECTION_SHIFT(bs, _bitCount) (DRFLAC_CACHE_L1_SIZE_BITS(bs) - (_bitCount)) 02160 #define DRFLAC_CACHE_L1_SELECT(bs, _bitCount) (((bs)->cache) & DRFLAC_CACHE_L1_SELECTION_MASK(_bitCount)) 02161 #define DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, _bitCount) (DRFLAC_CACHE_L1_SELECT((bs), (_bitCount)) >> DRFLAC_CACHE_L1_SELECTION_SHIFT((bs), (_bitCount))) 02162 #define DRFLAC_CACHE_L1_SELECT_AND_SHIFT_SAFE(bs, _bitCount)(DRFLAC_CACHE_L1_SELECT((bs), (_bitCount)) >> (DRFLAC_CACHE_L1_SELECTION_SHIFT((bs), (_bitCount)) & (DRFLAC_CACHE_L1_SIZE_BITS(bs)-1))) 02163 #define DRFLAC_CACHE_L2_SIZE_BYTES(bs) (sizeof((bs)->cacheL2)) 02164 #define DRFLAC_CACHE_L2_LINE_COUNT(bs) (DRFLAC_CACHE_L2_SIZE_BYTES(bs) / sizeof((bs)->cacheL2[0])) 02165 #define DRFLAC_CACHE_L2_LINES_REMAINING(bs) (DRFLAC_CACHE_L2_LINE_COUNT(bs) - (bs)->nextL2Line) 02166 02167 02168 #ifndef DR_FLAC_NO_CRC 02169 static DRFLAC_INLINE void drflac__reset_crc16(drflac_bs* bs) 02170 { 02171 bs->crc16 = 0; 02172 bs->crc16CacheIgnoredBytes = bs->consumedBits >> 3; 02173 } 02174 02175 static DRFLAC_INLINE void drflac__update_crc16(drflac_bs* bs) 02176 { 02177 if (bs->crc16CacheIgnoredBytes == 0) { 02178 bs->crc16 = drflac_crc16_cache(bs->crc16, bs->crc16Cache); 02179 } else { 02180 bs->crc16 = drflac_crc16_bytes(bs->crc16, bs->crc16Cache, DRFLAC_CACHE_L1_SIZE_BYTES(bs) - bs->crc16CacheIgnoredBytes); 02181 bs->crc16CacheIgnoredBytes = 0; 02182 } 02183 } 02184 02185 static DRFLAC_INLINE drflac_uint16 drflac__flush_crc16(drflac_bs* bs) 02186 { 02187 /* We should never be flushing in a situation where we are not aligned on a byte boundary. */ 02188 DRFLAC_ASSERT((DRFLAC_CACHE_L1_BITS_REMAINING(bs) & 7) == 0); 02189 02190 /* 02191 The bits that were read from the L1 cache need to be accumulated. The number of bytes needing to be accumulated is determined 02192 by the number of bits that have been consumed. 02193 */ 02194 if (DRFLAC_CACHE_L1_BITS_REMAINING(bs) == 0) { 02195 drflac__update_crc16(bs); 02196 } else { 02197 /* We only accumulate the consumed bits. */ 02198 bs->crc16 = drflac_crc16_bytes(bs->crc16, bs->crc16Cache >> DRFLAC_CACHE_L1_BITS_REMAINING(bs), (bs->consumedBits >> 3) - bs->crc16CacheIgnoredBytes); 02199 02200 /* 02201 The bits that we just accumulated should never be accumulated again. We need to keep track of how many bytes were accumulated 02202 so we can handle that later. 02203 */ 02204 bs->crc16CacheIgnoredBytes = bs->consumedBits >> 3; 02205 } 02206 02207 return bs->crc16; 02208 } 02209 #endif 02210 02211 static DRFLAC_INLINE drflac_bool32 drflac__reload_l1_cache_from_l2(drflac_bs* bs) 02212 { 02213 size_t bytesRead; 02214 size_t alignedL1LineCount; 02215 02216 /* Fast path. Try loading straight from L2. */ 02217 if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) { 02218 bs->cache = bs->cacheL2[bs->nextL2Line++]; 02219 return DRFLAC_TRUE; 02220 } 02221 02222 /* 02223 If we get here it means we've run out of data in the L2 cache. We'll need to fetch more from the client, if there's 02224 any left. 02225 */ 02226 if (bs->unalignedByteCount > 0) { 02227 return DRFLAC_FALSE; /* If we have any unaligned bytes it means there's no more aligned bytes left in the client. */ 02228 } 02229 02230 bytesRead = bs->onRead(bs->pUserData, bs->cacheL2, DRFLAC_CACHE_L2_SIZE_BYTES(bs)); 02231 02232 bs->nextL2Line = 0; 02233 if (bytesRead == DRFLAC_CACHE_L2_SIZE_BYTES(bs)) { 02234 bs->cache = bs->cacheL2[bs->nextL2Line++]; 02235 return DRFLAC_TRUE; 02236 } 02237 02238 02239 /* 02240 If we get here it means we were unable to retrieve enough data to fill the entire L2 cache. It probably 02241 means we've just reached the end of the file. We need to move the valid data down to the end of the buffer 02242 and adjust the index of the next line accordingly. Also keep in mind that the L2 cache must be aligned to 02243 the size of the L1 so we'll need to seek backwards by any misaligned bytes. 02244 */ 02245 alignedL1LineCount = bytesRead / DRFLAC_CACHE_L1_SIZE_BYTES(bs); 02246 02247 /* We need to keep track of any unaligned bytes for later use. */ 02248 bs->unalignedByteCount = bytesRead - (alignedL1LineCount * DRFLAC_CACHE_L1_SIZE_BYTES(bs)); 02249 if (bs->unalignedByteCount > 0) { 02250 bs->unalignedCache = bs->cacheL2[alignedL1LineCount]; 02251 } 02252 02253 if (alignedL1LineCount > 0) { 02254 size_t offset = DRFLAC_CACHE_L2_LINE_COUNT(bs) - alignedL1LineCount; 02255 size_t i; 02256 for (i = alignedL1LineCount; i > 0; --i) { 02257 bs->cacheL2[i-1 + offset] = bs->cacheL2[i-1]; 02258 } 02259 02260 bs->nextL2Line = (drflac_uint32)offset; 02261 bs->cache = bs->cacheL2[bs->nextL2Line++]; 02262 return DRFLAC_TRUE; 02263 } else { 02264 /* If we get into this branch it means we weren't able to load any L1-aligned data. */ 02265 bs->nextL2Line = DRFLAC_CACHE_L2_LINE_COUNT(bs); 02266 return DRFLAC_FALSE; 02267 } 02268 } 02269 02270 static drflac_bool32 drflac__reload_cache(drflac_bs* bs) 02271 { 02272 size_t bytesRead; 02273 02274 #ifndef DR_FLAC_NO_CRC 02275 drflac__update_crc16(bs); 02276 #endif 02277 02278 /* Fast path. Try just moving the next value in the L2 cache to the L1 cache. */ 02279 if (drflac__reload_l1_cache_from_l2(bs)) { 02280 bs->cache = drflac__be2host__cache_line(bs->cache); 02281 bs->consumedBits = 0; 02282 #ifndef DR_FLAC_NO_CRC 02283 bs->crc16Cache = bs->cache; 02284 #endif 02285 return DRFLAC_TRUE; 02286 } 02287 02288 /* Slow path. */ 02289 02290 /* 02291 If we get here it means we have failed to load the L1 cache from the L2. Likely we've just reached the end of the stream and the last 02292 few bytes did not meet the alignment requirements for the L2 cache. In this case we need to fall back to a slower path and read the 02293 data from the unaligned cache. 02294 */ 02295 bytesRead = bs->unalignedByteCount; 02296 if (bytesRead == 0) { 02297 bs->consumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs); /* <-- The stream has been exhausted, so marked the bits as consumed. */ 02298 return DRFLAC_FALSE; 02299 } 02300 02301 DRFLAC_ASSERT(bytesRead < DRFLAC_CACHE_L1_SIZE_BYTES(bs)); 02302 bs->consumedBits = (drflac_uint32)(DRFLAC_CACHE_L1_SIZE_BYTES(bs) - bytesRead) * 8; 02303 02304 bs->cache = drflac__be2host__cache_line(bs->unalignedCache); 02305 bs->cache &= DRFLAC_CACHE_L1_SELECTION_MASK(DRFLAC_CACHE_L1_BITS_REMAINING(bs)); /* <-- Make sure the consumed bits are always set to zero. Other parts of the library depend on this property. */ 02306 bs->unalignedByteCount = 0; /* <-- At this point the unaligned bytes have been moved into the cache and we thus have no more unaligned bytes. */ 02307 02308 #ifndef DR_FLAC_NO_CRC 02309 bs->crc16Cache = bs->cache >> bs->consumedBits; 02310 bs->crc16CacheIgnoredBytes = bs->consumedBits >> 3; 02311 #endif 02312 return DRFLAC_TRUE; 02313 } 02314 02315 static void drflac__reset_cache(drflac_bs* bs) 02316 { 02317 bs->nextL2Line = DRFLAC_CACHE_L2_LINE_COUNT(bs); /* <-- This clears the L2 cache. */ 02318 bs->consumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs); /* <-- This clears the L1 cache. */ 02319 bs->cache = 0; 02320 bs->unalignedByteCount = 0; /* <-- This clears the trailing unaligned bytes. */ 02321 bs->unalignedCache = 0; 02322 02323 #ifndef DR_FLAC_NO_CRC 02324 bs->crc16Cache = 0; 02325 bs->crc16CacheIgnoredBytes = 0; 02326 #endif 02327 } 02328 02329 02330 static DRFLAC_INLINE drflac_bool32 drflac__read_uint32(drflac_bs* bs, unsigned int bitCount, drflac_uint32* pResultOut) 02331 { 02332 DRFLAC_ASSERT(bs != NULL); 02333 DRFLAC_ASSERT(pResultOut != NULL); 02334 DRFLAC_ASSERT(bitCount > 0); 02335 DRFLAC_ASSERT(bitCount <= 32); 02336 02337 if (bs->consumedBits == DRFLAC_CACHE_L1_SIZE_BITS(bs)) { 02338 if (!drflac__reload_cache(bs)) { 02339 return DRFLAC_FALSE; 02340 } 02341 } 02342 02343 if (bitCount <= DRFLAC_CACHE_L1_BITS_REMAINING(bs)) { 02344 /* 02345 If we want to load all 32-bits from a 32-bit cache we need to do it slightly differently because we can't do 02346 a 32-bit shift on a 32-bit integer. This will never be the case on 64-bit caches, so we can have a slightly 02347 more optimal solution for this. 02348 */ 02349 #ifdef DRFLAC_64BIT 02350 *pResultOut = (drflac_uint32)DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, bitCount); 02351 bs->consumedBits += bitCount; 02352 bs->cache <<= bitCount; 02353 #else 02354 if (bitCount < DRFLAC_CACHE_L1_SIZE_BITS(bs)) { 02355 *pResultOut = (drflac_uint32)DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, bitCount); 02356 bs->consumedBits += bitCount; 02357 bs->cache <<= bitCount; 02358 } else { 02359 /* Cannot shift by 32-bits, so need to do it differently. */ 02360 *pResultOut = (drflac_uint32)bs->cache; 02361 bs->consumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs); 02362 bs->cache = 0; 02363 } 02364 #endif 02365 02366 return DRFLAC_TRUE; 02367 } else { 02368 /* It straddles the cached data. It will never cover more than the next chunk. We just read the number in two parts and combine them. */ 02369 drflac_uint32 bitCountHi = DRFLAC_CACHE_L1_BITS_REMAINING(bs); 02370 drflac_uint32 bitCountLo = bitCount - bitCountHi; 02371 drflac_uint32 resultHi; 02372 02373 DRFLAC_ASSERT(bitCountHi > 0); 02374 DRFLAC_ASSERT(bitCountHi < 32); 02375 resultHi = (drflac_uint32)DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, bitCountHi); 02376 02377 if (!drflac__reload_cache(bs)) { 02378 return DRFLAC_FALSE; 02379 } 02380 02381 *pResultOut = (resultHi << bitCountLo) | (drflac_uint32)DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, bitCountLo); 02382 bs->consumedBits += bitCountLo; 02383 bs->cache <<= bitCountLo; 02384 return DRFLAC_TRUE; 02385 } 02386 } 02387 02388 static drflac_bool32 drflac__read_int32(drflac_bs* bs, unsigned int bitCount, drflac_int32* pResult) 02389 { 02390 drflac_uint32 result; 02391 drflac_uint32 signbit; 02392 02393 DRFLAC_ASSERT(bs != NULL); 02394 DRFLAC_ASSERT(pResult != NULL); 02395 DRFLAC_ASSERT(bitCount > 0); 02396 DRFLAC_ASSERT(bitCount <= 32); 02397 02398 if (!drflac__read_uint32(bs, bitCount, &result)) { 02399 return DRFLAC_FALSE; 02400 } 02401 02402 signbit = ((result >> (bitCount-1)) & 0x01); 02403 result |= (~signbit + 1) << bitCount; 02404 02405 *pResult = (drflac_int32)result; 02406 return DRFLAC_TRUE; 02407 } 02408 02409 #ifdef DRFLAC_64BIT 02410 static drflac_bool32 drflac__read_uint64(drflac_bs* bs, unsigned int bitCount, drflac_uint64* pResultOut) 02411 { 02412 drflac_uint32 resultHi; 02413 drflac_uint32 resultLo; 02414 02415 DRFLAC_ASSERT(bitCount <= 64); 02416 DRFLAC_ASSERT(bitCount > 32); 02417 02418 if (!drflac__read_uint32(bs, bitCount - 32, &resultHi)) { 02419 return DRFLAC_FALSE; 02420 } 02421 02422 if (!drflac__read_uint32(bs, 32, &resultLo)) { 02423 return DRFLAC_FALSE; 02424 } 02425 02426 *pResultOut = (((drflac_uint64)resultHi) << 32) | ((drflac_uint64)resultLo); 02427 return DRFLAC_TRUE; 02428 } 02429 #endif 02430 02431 /* Function below is unused, but leaving it here in case I need to quickly add it again. */ 02432 #if 0 02433 static drflac_bool32 drflac__read_int64(drflac_bs* bs, unsigned int bitCount, drflac_int64* pResultOut) 02434 { 02435 drflac_uint64 result; 02436 drflac_uint64 signbit; 02437 02438 DRFLAC_ASSERT(bitCount <= 64); 02439 02440 if (!drflac__read_uint64(bs, bitCount, &result)) { 02441 return DRFLAC_FALSE; 02442 } 02443 02444 signbit = ((result >> (bitCount-1)) & 0x01); 02445 result |= (~signbit + 1) << bitCount; 02446 02447 *pResultOut = (drflac_int64)result; 02448 return DRFLAC_TRUE; 02449 } 02450 #endif 02451 02452 static drflac_bool32 drflac__read_uint16(drflac_bs* bs, unsigned int bitCount, drflac_uint16* pResult) 02453 { 02454 drflac_uint32 result; 02455 02456 DRFLAC_ASSERT(bs != NULL); 02457 DRFLAC_ASSERT(pResult != NULL); 02458 DRFLAC_ASSERT(bitCount > 0); 02459 DRFLAC_ASSERT(bitCount <= 16); 02460 02461 if (!drflac__read_uint32(bs, bitCount, &result)) { 02462 return DRFLAC_FALSE; 02463 } 02464 02465 *pResult = (drflac_uint16)result; 02466 return DRFLAC_TRUE; 02467 } 02468 02469 #if 0 02470 static drflac_bool32 drflac__read_int16(drflac_bs* bs, unsigned int bitCount, drflac_int16* pResult) 02471 { 02472 drflac_int32 result; 02473 02474 DRFLAC_ASSERT(bs != NULL); 02475 DRFLAC_ASSERT(pResult != NULL); 02476 DRFLAC_ASSERT(bitCount > 0); 02477 DRFLAC_ASSERT(bitCount <= 16); 02478 02479 if (!drflac__read_int32(bs, bitCount, &result)) { 02480 return DRFLAC_FALSE; 02481 } 02482 02483 *pResult = (drflac_int16)result; 02484 return DRFLAC_TRUE; 02485 } 02486 #endif 02487 02488 static drflac_bool32 drflac__read_uint8(drflac_bs* bs, unsigned int bitCount, drflac_uint8* pResult) 02489 { 02490 drflac_uint32 result; 02491 02492 DRFLAC_ASSERT(bs != NULL); 02493 DRFLAC_ASSERT(pResult != NULL); 02494 DRFLAC_ASSERT(bitCount > 0); 02495 DRFLAC_ASSERT(bitCount <= 8); 02496 02497 if (!drflac__read_uint32(bs, bitCount, &result)) { 02498 return DRFLAC_FALSE; 02499 } 02500 02501 *pResult = (drflac_uint8)result; 02502 return DRFLAC_TRUE; 02503 } 02504 02505 static drflac_bool32 drflac__read_int8(drflac_bs* bs, unsigned int bitCount, drflac_int8* pResult) 02506 { 02507 drflac_int32 result; 02508 02509 DRFLAC_ASSERT(bs != NULL); 02510 DRFLAC_ASSERT(pResult != NULL); 02511 DRFLAC_ASSERT(bitCount > 0); 02512 DRFLAC_ASSERT(bitCount <= 8); 02513 02514 if (!drflac__read_int32(bs, bitCount, &result)) { 02515 return DRFLAC_FALSE; 02516 } 02517 02518 *pResult = (drflac_int8)result; 02519 return DRFLAC_TRUE; 02520 } 02521 02522 02523 static drflac_bool32 drflac__seek_bits(drflac_bs* bs, size_t bitsToSeek) 02524 { 02525 if (bitsToSeek <= DRFLAC_CACHE_L1_BITS_REMAINING(bs)) { 02526 bs->consumedBits += (drflac_uint32)bitsToSeek; 02527 bs->cache <<= bitsToSeek; 02528 return DRFLAC_TRUE; 02529 } else { 02530 /* It straddles the cached data. This function isn't called too frequently so I'm favouring simplicity here. */ 02531 bitsToSeek -= DRFLAC_CACHE_L1_BITS_REMAINING(bs); 02532 bs->consumedBits += DRFLAC_CACHE_L1_BITS_REMAINING(bs); 02533 bs->cache = 0; 02534 02535 /* Simple case. Seek in groups of the same number as bits that fit within a cache line. */ 02536 #ifdef DRFLAC_64BIT 02537 while (bitsToSeek >= DRFLAC_CACHE_L1_SIZE_BITS(bs)) { 02538 drflac_uint64 bin; 02539 if (!drflac__read_uint64(bs, DRFLAC_CACHE_L1_SIZE_BITS(bs), &bin)) { 02540 return DRFLAC_FALSE; 02541 } 02542 bitsToSeek -= DRFLAC_CACHE_L1_SIZE_BITS(bs); 02543 } 02544 #else 02545 while (bitsToSeek >= DRFLAC_CACHE_L1_SIZE_BITS(bs)) { 02546 drflac_uint32 bin; 02547 if (!drflac__read_uint32(bs, DRFLAC_CACHE_L1_SIZE_BITS(bs), &bin)) { 02548 return DRFLAC_FALSE; 02549 } 02550 bitsToSeek -= DRFLAC_CACHE_L1_SIZE_BITS(bs); 02551 } 02552 #endif 02553 02554 /* Whole leftover bytes. */ 02555 while (bitsToSeek >= 8) { 02556 drflac_uint8 bin; 02557 if (!drflac__read_uint8(bs, 8, &bin)) { 02558 return DRFLAC_FALSE; 02559 } 02560 bitsToSeek -= 8; 02561 } 02562 02563 /* Leftover bits. */ 02564 if (bitsToSeek > 0) { 02565 drflac_uint8 bin; 02566 if (!drflac__read_uint8(bs, (drflac_uint32)bitsToSeek, &bin)) { 02567 return DRFLAC_FALSE; 02568 } 02569 bitsToSeek = 0; /* <-- Necessary for the assert below. */ 02570 } 02571 02572 DRFLAC_ASSERT(bitsToSeek == 0); 02573 return DRFLAC_TRUE; 02574 } 02575 } 02576 02577 02578 /* This function moves the bit streamer to the first bit after the sync code (bit 15 of the of the frame header). It will also update the CRC-16. */ 02579 static drflac_bool32 drflac__find_and_seek_to_next_sync_code(drflac_bs* bs) 02580 { 02581 DRFLAC_ASSERT(bs != NULL); 02582 02583 /* 02584 The sync code is always aligned to 8 bits. This is convenient for us because it means we can do byte-aligned movements. The first 02585 thing to do is align to the next byte. 02586 */ 02587 if (!drflac__seek_bits(bs, DRFLAC_CACHE_L1_BITS_REMAINING(bs) & 7)) { 02588 return DRFLAC_FALSE; 02589 } 02590 02591 for (;;) { 02592 drflac_uint8 hi; 02593 02594 #ifndef DR_FLAC_NO_CRC 02595 drflac__reset_crc16(bs); 02596 #endif 02597 02598 if (!drflac__read_uint8(bs, 8, &hi)) { 02599 return DRFLAC_FALSE; 02600 } 02601 02602 if (hi == 0xFF) { 02603 drflac_uint8 lo; 02604 if (!drflac__read_uint8(bs, 6, &lo)) { 02605 return DRFLAC_FALSE; 02606 } 02607 02608 if (lo == 0x3E) { 02609 return DRFLAC_TRUE; 02610 } else { 02611 if (!drflac__seek_bits(bs, DRFLAC_CACHE_L1_BITS_REMAINING(bs) & 7)) { 02612 return DRFLAC_FALSE; 02613 } 02614 } 02615 } 02616 } 02617 02618 /* Should never get here. */ 02619 /*return DRFLAC_FALSE;*/ 02620 } 02621 02622 02623 #if defined(DRFLAC_HAS_LZCNT_INTRINSIC) 02624 #define DRFLAC_IMPLEMENT_CLZ_LZCNT 02625 #endif 02626 #if defined(_MSC_VER) && _MSC_VER >= 1400 && (defined(DRFLAC_X64) || defined(DRFLAC_X86)) 02627 #define DRFLAC_IMPLEMENT_CLZ_MSVC 02628 #endif 02629 02630 static DRFLAC_INLINE drflac_uint32 drflac__clz_software(drflac_cache_t x) 02631 { 02632 drflac_uint32 n; 02633 static drflac_uint32 clz_table_4[] = { 02634 0, 02635 4, 02636 3, 3, 02637 2, 2, 2, 2, 02638 1, 1, 1, 1, 1, 1, 1, 1 02639 }; 02640 02641 if (x == 0) { 02642 return sizeof(x)*8; 02643 } 02644 02645 n = clz_table_4[x >> (sizeof(x)*8 - 4)]; 02646 if (n == 0) { 02647 #ifdef DRFLAC_64BIT 02648 if ((x & ((drflac_uint64)0xFFFFFFFF << 32)) == 0) { n = 32; x <<= 32; } 02649 if ((x & ((drflac_uint64)0xFFFF0000 << 32)) == 0) { n += 16; x <<= 16; } 02650 if ((x & ((drflac_uint64)0xFF000000 << 32)) == 0) { n += 8; x <<= 8; } 02651 if ((x & ((drflac_uint64)0xF0000000 << 32)) == 0) { n += 4; x <<= 4; } 02652 #else 02653 if ((x & 0xFFFF0000) == 0) { n = 16; x <<= 16; } 02654 if ((x & 0xFF000000) == 0) { n += 8; x <<= 8; } 02655 if ((x & 0xF0000000) == 0) { n += 4; x <<= 4; } 02656 #endif 02657 n += clz_table_4[x >> (sizeof(x)*8 - 4)]; 02658 } 02659 02660 return n - 1; 02661 } 02662 02663 #ifdef DRFLAC_IMPLEMENT_CLZ_LZCNT 02664 static DRFLAC_INLINE drflac_bool32 drflac__is_lzcnt_supported(void) 02665 { 02666 /* Fast compile time check for ARM. */ 02667 #if defined(DRFLAC_HAS_LZCNT_INTRINSIC) && defined(DRFLAC_ARM) && (defined(__ARM_ARCH) && __ARM_ARCH >= 5) 02668 return DRFLAC_TRUE; 02669 #else 02670 /* If the compiler itself does not support the intrinsic then we'll need to return false. */ 02671 #ifdef DRFLAC_HAS_LZCNT_INTRINSIC 02672 return drflac__gIsLZCNTSupported; 02673 #else 02674 return DRFLAC_FALSE; 02675 #endif 02676 #endif 02677 } 02678 02679 static DRFLAC_INLINE drflac_uint32 drflac__clz_lzcnt(drflac_cache_t x) 02680 { 02681 #if defined(_MSC_VER) && !defined(__clang__) 02682 #ifdef DRFLAC_64BIT 02683 return (drflac_uint32)__lzcnt64(x); 02684 #else 02685 return (drflac_uint32)__lzcnt(x); 02686 #endif 02687 #else 02688 #if defined(__GNUC__) || defined(__clang__) 02689 #if defined(DRFLAC_X64) 02690 { 02691 drflac_uint64 r; 02692 __asm__ __volatile__ ( 02693 "lzcnt{ %1, %0| %0, %1}" : "=r"(r) : "r"(x) 02694 ); 02695 02696 return (drflac_uint32)r; 02697 } 02698 #elif defined(DRFLAC_X86) 02699 { 02700 drflac_uint32 r; 02701 __asm__ __volatile__ ( 02702 "lzcnt{l %1, %0| %0, %1}" : "=r"(r) : "r"(x) 02703 ); 02704 02705 return r; 02706 } 02707 #elif defined(DRFLAC_ARM) && (defined(__ARM_ARCH) && __ARM_ARCH >= 5) && !defined(DRFLAC_64BIT) /* <-- I haven't tested 64-bit inline assembly, so only enabling this for the 32-bit build for now. */ 02708 { 02709 unsigned int r; 02710 __asm__ __volatile__ ( 02711 #if defined(DRFLAC_64BIT) 02712 "clz %w[out], %w[in]" : [out]"=r"(r) : [in]"r"(x) /* <-- This is untested. If someone in the community could test this, that would be appreciated! */ 02713 #else 02714 "clz %[out], %[in]" : [out]"=r"(r) : [in]"r"(x) 02715 #endif 02716 ); 02717 02718 return r; 02719 } 02720 #else 02721 if (x == 0) { 02722 return sizeof(x)*8; 02723 } 02724 #ifdef DRFLAC_64BIT 02725 return (drflac_uint32)__builtin_clzll((drflac_uint64)x); 02726 #else 02727 return (drflac_uint32)__builtin_clzl((drflac_uint32)x); 02728 #endif 02729 #endif 02730 #else 02731 /* Unsupported compiler. */ 02732 #error "This compiler does not support the lzcnt intrinsic." 02733 #endif 02734 #endif 02735 } 02736 #endif 02737 02738 #ifdef DRFLAC_IMPLEMENT_CLZ_MSVC 02739 #include <intrin.h> /* For BitScanReverse(). */ 02740 02741 static DRFLAC_INLINE drflac_uint32 drflac__clz_msvc(drflac_cache_t x) 02742 { 02743 drflac_uint32 n; 02744 02745 if (x == 0) { 02746 return sizeof(x)*8; 02747 } 02748 02749 #ifdef DRFLAC_64BIT 02750 _BitScanReverse64((unsigned long*)&n, x); 02751 #else 02752 _BitScanReverse((unsigned long*)&n, x); 02753 #endif 02754 return sizeof(x)*8 - n - 1; 02755 } 02756 #endif 02757 02758 static DRFLAC_INLINE drflac_uint32 drflac__clz(drflac_cache_t x) 02759 { 02760 #ifdef DRFLAC_IMPLEMENT_CLZ_LZCNT 02761 if (drflac__is_lzcnt_supported()) { 02762 return drflac__clz_lzcnt(x); 02763 } else 02764 #endif 02765 { 02766 #ifdef DRFLAC_IMPLEMENT_CLZ_MSVC 02767 return drflac__clz_msvc(x); 02768 #else 02769 return drflac__clz_software(x); 02770 #endif 02771 } 02772 } 02773 02774 02775 static DRFLAC_INLINE drflac_bool32 drflac__seek_past_next_set_bit(drflac_bs* bs, unsigned int* pOffsetOut) 02776 { 02777 drflac_uint32 zeroCounter = 0; 02778 drflac_uint32 setBitOffsetPlus1; 02779 02780 while (bs->cache == 0) { 02781 zeroCounter += (drflac_uint32)DRFLAC_CACHE_L1_BITS_REMAINING(bs); 02782 if (!drflac__reload_cache(bs)) { 02783 return DRFLAC_FALSE; 02784 } 02785 } 02786 02787 setBitOffsetPlus1 = drflac__clz(bs->cache); 02788 setBitOffsetPlus1 += 1; 02789 02790 bs->consumedBits += setBitOffsetPlus1; 02791 bs->cache <<= setBitOffsetPlus1; 02792 02793 *pOffsetOut = zeroCounter + setBitOffsetPlus1 - 1; 02794 return DRFLAC_TRUE; 02795 } 02796 02797 02798 02799 static drflac_bool32 drflac__seek_to_byte(drflac_bs* bs, drflac_uint64 offsetFromStart) 02800 { 02801 DRFLAC_ASSERT(bs != NULL); 02802 DRFLAC_ASSERT(offsetFromStart > 0); 02803 02804 /* 02805 Seeking from the start is not quite as trivial as it sounds because the onSeek callback takes a signed 32-bit integer (which 02806 is intentional because it simplifies the implementation of the onSeek callbacks), however offsetFromStart is unsigned 64-bit. 02807 To resolve we just need to do an initial seek from the start, and then a series of offset seeks to make up the remainder. 02808 */ 02809 if (offsetFromStart > 0x7FFFFFFF) { 02810 drflac_uint64 bytesRemaining = offsetFromStart; 02811 if (!bs->onSeek(bs->pUserData, 0x7FFFFFFF, drflac_seek_origin_start)) { 02812 return DRFLAC_FALSE; 02813 } 02814 bytesRemaining -= 0x7FFFFFFF; 02815 02816 while (bytesRemaining > 0x7FFFFFFF) { 02817 if (!bs->onSeek(bs->pUserData, 0x7FFFFFFF, drflac_seek_origin_current)) { 02818 return DRFLAC_FALSE; 02819 } 02820 bytesRemaining -= 0x7FFFFFFF; 02821 } 02822 02823 if (bytesRemaining > 0) { 02824 if (!bs->onSeek(bs->pUserData, (int)bytesRemaining, drflac_seek_origin_current)) { 02825 return DRFLAC_FALSE; 02826 } 02827 } 02828 } else { 02829 if (!bs->onSeek(bs->pUserData, (int)offsetFromStart, drflac_seek_origin_start)) { 02830 return DRFLAC_FALSE; 02831 } 02832 } 02833 02834 /* The cache should be reset to force a reload of fresh data from the client. */ 02835 drflac__reset_cache(bs); 02836 return DRFLAC_TRUE; 02837 } 02838 02839 02840 static drflac_result drflac__read_utf8_coded_number(drflac_bs* bs, drflac_uint64* pNumberOut, drflac_uint8* pCRCOut) 02841 { 02842 drflac_uint8 crc; 02843 drflac_uint64 result; 02844 drflac_uint8 utf8[7] = {0}; 02845 int byteCount; 02846 int i; 02847 02848 DRFLAC_ASSERT(bs != NULL); 02849 DRFLAC_ASSERT(pNumberOut != NULL); 02850 DRFLAC_ASSERT(pCRCOut != NULL); 02851 02852 crc = *pCRCOut; 02853 02854 if (!drflac__read_uint8(bs, 8, utf8)) { 02855 *pNumberOut = 0; 02856 return DRFLAC_AT_END; 02857 } 02858 crc = drflac_crc8(crc, utf8[0], 8); 02859 02860 if ((utf8[0] & 0x80) == 0) { 02861 *pNumberOut = utf8[0]; 02862 *pCRCOut = crc; 02863 return DRFLAC_SUCCESS; 02864 } 02865 02866 /*byteCount = 1;*/ 02867 if ((utf8[0] & 0xE0) == 0xC0) { 02868 byteCount = 2; 02869 } else if ((utf8[0] & 0xF0) == 0xE0) { 02870 byteCount = 3; 02871 } else if ((utf8[0] & 0xF8) == 0xF0) { 02872 byteCount = 4; 02873 } else if ((utf8[0] & 0xFC) == 0xF8) { 02874 byteCount = 5; 02875 } else if ((utf8[0] & 0xFE) == 0xFC) { 02876 byteCount = 6; 02877 } else if ((utf8[0] & 0xFF) == 0xFE) { 02878 byteCount = 7; 02879 } else { 02880 *pNumberOut = 0; 02881 return DRFLAC_CRC_MISMATCH; /* Bad UTF-8 encoding. */ 02882 } 02883 02884 /* Read extra bytes. */ 02885 DRFLAC_ASSERT(byteCount > 1); 02886 02887 result = (drflac_uint64)(utf8[0] & (0xFF >> (byteCount + 1))); 02888 for (i = 1; i < byteCount; ++i) { 02889 if (!drflac__read_uint8(bs, 8, utf8 + i)) { 02890 *pNumberOut = 0; 02891 return DRFLAC_AT_END; 02892 } 02893 crc = drflac_crc8(crc, utf8[i], 8); 02894 02895 result = (result << 6) | (utf8[i] & 0x3F); 02896 } 02897 02898 *pNumberOut = result; 02899 *pCRCOut = crc; 02900 return DRFLAC_SUCCESS; 02901 } 02902 02903 02904 02905 /* 02906 The next two functions are responsible for calculating the prediction. 02907 02908 When the bits per sample is >16 we need to use 64-bit integer arithmetic because otherwise we'll run out of precision. It's 02909 safe to assume this will be slower on 32-bit platforms so we use a more optimal solution when the bits per sample is <=16. 02910 */ 02911 static DRFLAC_INLINE drflac_int32 drflac__calculate_prediction_32(drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pDecodedSamples) 02912 { 02913 drflac_int32 prediction = 0; 02914 02915 DRFLAC_ASSERT(order <= 32); 02916 02917 /* 32-bit version. */ 02918 02919 /* VC++ optimizes this to a single jmp. I've not yet verified this for other compilers. */ 02920 switch (order) 02921 { 02922 case 32: prediction += coefficients[31] * pDecodedSamples[-32]; 02923 case 31: prediction += coefficients[30] * pDecodedSamples[-31]; 02924 case 30: prediction += coefficients[29] * pDecodedSamples[-30]; 02925 case 29: prediction += coefficients[28] * pDecodedSamples[-29]; 02926 case 28: prediction += coefficients[27] * pDecodedSamples[-28]; 02927 case 27: prediction += coefficients[26] * pDecodedSamples[-27]; 02928 case 26: prediction += coefficients[25] * pDecodedSamples[-26]; 02929 case 25: prediction += coefficients[24] * pDecodedSamples[-25]; 02930 case 24: prediction += coefficients[23] * pDecodedSamples[-24]; 02931 case 23: prediction += coefficients[22] * pDecodedSamples[-23]; 02932 case 22: prediction += coefficients[21] * pDecodedSamples[-22]; 02933 case 21: prediction += coefficients[20] * pDecodedSamples[-21]; 02934 case 20: prediction += coefficients[19] * pDecodedSamples[-20]; 02935 case 19: prediction += coefficients[18] * pDecodedSamples[-19]; 02936 case 18: prediction += coefficients[17] * pDecodedSamples[-18]; 02937 case 17: prediction += coefficients[16] * pDecodedSamples[-17]; 02938 case 16: prediction += coefficients[15] * pDecodedSamples[-16]; 02939 case 15: prediction += coefficients[14] * pDecodedSamples[-15]; 02940 case 14: prediction += coefficients[13] * pDecodedSamples[-14]; 02941 case 13: prediction += coefficients[12] * pDecodedSamples[-13]; 02942 case 12: prediction += coefficients[11] * pDecodedSamples[-12]; 02943 case 11: prediction += coefficients[10] * pDecodedSamples[-11]; 02944 case 10: prediction += coefficients[ 9] * pDecodedSamples[-10]; 02945 case 9: prediction += coefficients[ 8] * pDecodedSamples[- 9]; 02946 case 8: prediction += coefficients[ 7] * pDecodedSamples[- 8]; 02947 case 7: prediction += coefficients[ 6] * pDecodedSamples[- 7]; 02948 case 6: prediction += coefficients[ 5] * pDecodedSamples[- 6]; 02949 case 5: prediction += coefficients[ 4] * pDecodedSamples[- 5]; 02950 case 4: prediction += coefficients[ 3] * pDecodedSamples[- 4]; 02951 case 3: prediction += coefficients[ 2] * pDecodedSamples[- 3]; 02952 case 2: prediction += coefficients[ 1] * pDecodedSamples[- 2]; 02953 case 1: prediction += coefficients[ 0] * pDecodedSamples[- 1]; 02954 } 02955 02956 return (drflac_int32)(prediction >> shift); 02957 } 02958 02959 static DRFLAC_INLINE drflac_int32 drflac__calculate_prediction_64(drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pDecodedSamples) 02960 { 02961 drflac_int64 prediction; 02962 02963 DRFLAC_ASSERT(order <= 32); 02964 02965 /* 64-bit version. */ 02966 02967 /* This method is faster on the 32-bit build when compiling with VC++. See note below. */ 02968 #ifndef DRFLAC_64BIT 02969 if (order == 8) 02970 { 02971 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 02972 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 02973 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 02974 prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4]; 02975 prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5]; 02976 prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6]; 02977 prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7]; 02978 prediction += coefficients[7] * (drflac_int64)pDecodedSamples[-8]; 02979 } 02980 else if (order == 7) 02981 { 02982 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 02983 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 02984 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 02985 prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4]; 02986 prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5]; 02987 prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6]; 02988 prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7]; 02989 } 02990 else if (order == 3) 02991 { 02992 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 02993 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 02994 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 02995 } 02996 else if (order == 6) 02997 { 02998 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 02999 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 03000 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 03001 prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4]; 03002 prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5]; 03003 prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6]; 03004 } 03005 else if (order == 5) 03006 { 03007 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 03008 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 03009 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 03010 prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4]; 03011 prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5]; 03012 } 03013 else if (order == 4) 03014 { 03015 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 03016 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 03017 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 03018 prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4]; 03019 } 03020 else if (order == 12) 03021 { 03022 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 03023 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 03024 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 03025 prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4]; 03026 prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5]; 03027 prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6]; 03028 prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7]; 03029 prediction += coefficients[7] * (drflac_int64)pDecodedSamples[-8]; 03030 prediction += coefficients[8] * (drflac_int64)pDecodedSamples[-9]; 03031 prediction += coefficients[9] * (drflac_int64)pDecodedSamples[-10]; 03032 prediction += coefficients[10] * (drflac_int64)pDecodedSamples[-11]; 03033 prediction += coefficients[11] * (drflac_int64)pDecodedSamples[-12]; 03034 } 03035 else if (order == 2) 03036 { 03037 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 03038 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 03039 } 03040 else if (order == 1) 03041 { 03042 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 03043 } 03044 else if (order == 10) 03045 { 03046 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 03047 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 03048 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 03049 prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4]; 03050 prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5]; 03051 prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6]; 03052 prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7]; 03053 prediction += coefficients[7] * (drflac_int64)pDecodedSamples[-8]; 03054 prediction += coefficients[8] * (drflac_int64)pDecodedSamples[-9]; 03055 prediction += coefficients[9] * (drflac_int64)pDecodedSamples[-10]; 03056 } 03057 else if (order == 9) 03058 { 03059 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 03060 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 03061 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 03062 prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4]; 03063 prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5]; 03064 prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6]; 03065 prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7]; 03066 prediction += coefficients[7] * (drflac_int64)pDecodedSamples[-8]; 03067 prediction += coefficients[8] * (drflac_int64)pDecodedSamples[-9]; 03068 } 03069 else if (order == 11) 03070 { 03071 prediction = coefficients[0] * (drflac_int64)pDecodedSamples[-1]; 03072 prediction += coefficients[1] * (drflac_int64)pDecodedSamples[-2]; 03073 prediction += coefficients[2] * (drflac_int64)pDecodedSamples[-3]; 03074 prediction += coefficients[3] * (drflac_int64)pDecodedSamples[-4]; 03075 prediction += coefficients[4] * (drflac_int64)pDecodedSamples[-5]; 03076 prediction += coefficients[5] * (drflac_int64)pDecodedSamples[-6]; 03077 prediction += coefficients[6] * (drflac_int64)pDecodedSamples[-7]; 03078 prediction += coefficients[7] * (drflac_int64)pDecodedSamples[-8]; 03079 prediction += coefficients[8] * (drflac_int64)pDecodedSamples[-9]; 03080 prediction += coefficients[9] * (drflac_int64)pDecodedSamples[-10]; 03081 prediction += coefficients[10] * (drflac_int64)pDecodedSamples[-11]; 03082 } 03083 else 03084 { 03085 int j; 03086 03087 prediction = 0; 03088 for (j = 0; j < (int)order; ++j) { 03089 prediction += coefficients[j] * (drflac_int64)pDecodedSamples[-j-1]; 03090 } 03091 } 03092 #endif 03093 03094 /* 03095 VC++ optimizes this to a single jmp instruction, but only the 64-bit build. The 32-bit build generates less efficient code for some 03096 reason. The ugly version above is faster so we'll just switch between the two depending on the target platform. 03097 */ 03098 #ifdef DRFLAC_64BIT 03099 prediction = 0; 03100 switch (order) 03101 { 03102 case 32: prediction += coefficients[31] * (drflac_int64)pDecodedSamples[-32]; 03103 case 31: prediction += coefficients[30] * (drflac_int64)pDecodedSamples[-31]; 03104 case 30: prediction += coefficients[29] * (drflac_int64)pDecodedSamples[-30]; 03105 case 29: prediction += coefficients[28] * (drflac_int64)pDecodedSamples[-29]; 03106 case 28: prediction += coefficients[27] * (drflac_int64)pDecodedSamples[-28]; 03107 case 27: prediction += coefficients[26] * (drflac_int64)pDecodedSamples[-27]; 03108 case 26: prediction += coefficients[25] * (drflac_int64)pDecodedSamples[-26]; 03109 case 25: prediction += coefficients[24] * (drflac_int64)pDecodedSamples[-25]; 03110 case 24: prediction += coefficients[23] * (drflac_int64)pDecodedSamples[-24]; 03111 case 23: prediction += coefficients[22] * (drflac_int64)pDecodedSamples[-23]; 03112 case 22: prediction += coefficients[21] * (drflac_int64)pDecodedSamples[-22]; 03113 case 21: prediction += coefficients[20] * (drflac_int64)pDecodedSamples[-21]; 03114 case 20: prediction += coefficients[19] * (drflac_int64)pDecodedSamples[-20]; 03115 case 19: prediction += coefficients[18] * (drflac_int64)pDecodedSamples[-19]; 03116 case 18: prediction += coefficients[17] * (drflac_int64)pDecodedSamples[-18]; 03117 case 17: prediction += coefficients[16] * (drflac_int64)pDecodedSamples[-17]; 03118 case 16: prediction += coefficients[15] * (drflac_int64)pDecodedSamples[-16]; 03119 case 15: prediction += coefficients[14] * (drflac_int64)pDecodedSamples[-15]; 03120 case 14: prediction += coefficients[13] * (drflac_int64)pDecodedSamples[-14]; 03121 case 13: prediction += coefficients[12] * (drflac_int64)pDecodedSamples[-13]; 03122 case 12: prediction += coefficients[11] * (drflac_int64)pDecodedSamples[-12]; 03123 case 11: prediction += coefficients[10] * (drflac_int64)pDecodedSamples[-11]; 03124 case 10: prediction += coefficients[ 9] * (drflac_int64)pDecodedSamples[-10]; 03125 case 9: prediction += coefficients[ 8] * (drflac_int64)pDecodedSamples[- 9]; 03126 case 8: prediction += coefficients[ 7] * (drflac_int64)pDecodedSamples[- 8]; 03127 case 7: prediction += coefficients[ 6] * (drflac_int64)pDecodedSamples[- 7]; 03128 case 6: prediction += coefficients[ 5] * (drflac_int64)pDecodedSamples[- 6]; 03129 case 5: prediction += coefficients[ 4] * (drflac_int64)pDecodedSamples[- 5]; 03130 case 4: prediction += coefficients[ 3] * (drflac_int64)pDecodedSamples[- 4]; 03131 case 3: prediction += coefficients[ 2] * (drflac_int64)pDecodedSamples[- 3]; 03132 case 2: prediction += coefficients[ 1] * (drflac_int64)pDecodedSamples[- 2]; 03133 case 1: prediction += coefficients[ 0] * (drflac_int64)pDecodedSamples[- 1]; 03134 } 03135 #endif 03136 03137 return (drflac_int32)(prediction >> shift); 03138 } 03139 03140 03141 #if 0 03142 /* 03143 Reference implementation for reading and decoding samples with residual. This is intentionally left unoptimized for the 03144 sake of readability and should only be used as a reference. 03145 */ 03146 static drflac_bool32 drflac__decode_samples_with_residual__rice__reference(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 03147 { 03148 drflac_uint32 i; 03149 03150 DRFLAC_ASSERT(bs != NULL); 03151 DRFLAC_ASSERT(count > 0); 03152 DRFLAC_ASSERT(pSamplesOut != NULL); 03153 03154 for (i = 0; i < count; ++i) { 03155 drflac_uint32 zeroCounter = 0; 03156 for (;;) { 03157 drflac_uint8 bit; 03158 if (!drflac__read_uint8(bs, 1, &bit)) { 03159 return DRFLAC_FALSE; 03160 } 03161 03162 if (bit == 0) { 03163 zeroCounter += 1; 03164 } else { 03165 break; 03166 } 03167 } 03168 03169 drflac_uint32 decodedRice; 03170 if (riceParam > 0) { 03171 if (!drflac__read_uint32(bs, riceParam, &decodedRice)) { 03172 return DRFLAC_FALSE; 03173 } 03174 } else { 03175 decodedRice = 0; 03176 } 03177 03178 decodedRice |= (zeroCounter << riceParam); 03179 if ((decodedRice & 0x01)) { 03180 decodedRice = ~(decodedRice >> 1); 03181 } else { 03182 decodedRice = (decodedRice >> 1); 03183 } 03184 03185 03186 if (bitsPerSample+shift >= 32) { 03187 pSamplesOut[i] = decodedRice + drflac__calculate_prediction_64(order, shift, coefficients, pSamplesOut + i); 03188 } else { 03189 pSamplesOut[i] = decodedRice + drflac__calculate_prediction_32(order, shift, coefficients, pSamplesOut + i); 03190 } 03191 } 03192 03193 return DRFLAC_TRUE; 03194 } 03195 #endif 03196 03197 #if 0 03198 static drflac_bool32 drflac__read_rice_parts__reference(drflac_bs* bs, drflac_uint8 riceParam, drflac_uint32* pZeroCounterOut, drflac_uint32* pRiceParamPartOut) 03199 { 03200 drflac_uint32 zeroCounter = 0; 03201 drflac_uint32 decodedRice; 03202 03203 for (;;) { 03204 drflac_uint8 bit; 03205 if (!drflac__read_uint8(bs, 1, &bit)) { 03206 return DRFLAC_FALSE; 03207 } 03208 03209 if (bit == 0) { 03210 zeroCounter += 1; 03211 } else { 03212 break; 03213 } 03214 } 03215 03216 if (riceParam > 0) { 03217 if (!drflac__read_uint32(bs, riceParam, &decodedRice)) { 03218 return DRFLAC_FALSE; 03219 } 03220 } else { 03221 decodedRice = 0; 03222 } 03223 03224 *pZeroCounterOut = zeroCounter; 03225 *pRiceParamPartOut = decodedRice; 03226 return DRFLAC_TRUE; 03227 } 03228 #endif 03229 03230 #if 0 03231 static DRFLAC_INLINE drflac_bool32 drflac__read_rice_parts(drflac_bs* bs, drflac_uint8 riceParam, drflac_uint32* pZeroCounterOut, drflac_uint32* pRiceParamPartOut) 03232 { 03233 drflac_cache_t riceParamMask; 03234 drflac_uint32 zeroCounter; 03235 drflac_uint32 setBitOffsetPlus1; 03236 drflac_uint32 riceParamPart; 03237 drflac_uint32 riceLength; 03238 03239 DRFLAC_ASSERT(riceParam > 0); /* <-- riceParam should never be 0. drflac__read_rice_parts__param_equals_zero() should be used instead for this case. */ 03240 03241 riceParamMask = DRFLAC_CACHE_L1_SELECTION_MASK(riceParam); 03242 03243 zeroCounter = 0; 03244 while (bs->cache == 0) { 03245 zeroCounter += (drflac_uint32)DRFLAC_CACHE_L1_BITS_REMAINING(bs); 03246 if (!drflac__reload_cache(bs)) { 03247 return DRFLAC_FALSE; 03248 } 03249 } 03250 03251 setBitOffsetPlus1 = drflac__clz(bs->cache); 03252 zeroCounter += setBitOffsetPlus1; 03253 setBitOffsetPlus1 += 1; 03254 03255 riceLength = setBitOffsetPlus1 + riceParam; 03256 if (riceLength < DRFLAC_CACHE_L1_BITS_REMAINING(bs)) { 03257 riceParamPart = (drflac_uint32)((bs->cache & (riceParamMask >> setBitOffsetPlus1)) >> DRFLAC_CACHE_L1_SELECTION_SHIFT(bs, riceLength)); 03258 03259 bs->consumedBits += riceLength; 03260 bs->cache <<= riceLength; 03261 } else { 03262 drflac_uint32 bitCountLo; 03263 drflac_cache_t resultHi; 03264 03265 bs->consumedBits += riceLength; 03266 bs->cache <<= setBitOffsetPlus1 & (DRFLAC_CACHE_L1_SIZE_BITS(bs)-1); /* <-- Equivalent to "if (setBitOffsetPlus1 < DRFLAC_CACHE_L1_SIZE_BITS(bs)) { bs->cache <<= setBitOffsetPlus1; }" */ 03267 03268 /* It straddles the cached data. It will never cover more than the next chunk. We just read the number in two parts and combine them. */ 03269 bitCountLo = bs->consumedBits - DRFLAC_CACHE_L1_SIZE_BITS(bs); 03270 resultHi = DRFLAC_CACHE_L1_SELECT_AND_SHIFT(bs, riceParam); /* <-- Use DRFLAC_CACHE_L1_SELECT_AND_SHIFT_SAFE() if ever this function allows riceParam=0. */ 03271 03272 if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) { 03273 #ifndef DR_FLAC_NO_CRC 03274 drflac__update_crc16(bs); 03275 #endif 03276 bs->cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]); 03277 bs->consumedBits = 0; 03278 #ifndef DR_FLAC_NO_CRC 03279 bs->crc16Cache = bs->cache; 03280 #endif 03281 } else { 03282 /* Slow path. We need to fetch more data from the client. */ 03283 if (!drflac__reload_cache(bs)) { 03284 return DRFLAC_FALSE; 03285 } 03286 } 03287 03288 riceParamPart = (drflac_uint32)(resultHi | DRFLAC_CACHE_L1_SELECT_AND_SHIFT_SAFE(bs, bitCountLo)); 03289 03290 bs->consumedBits += bitCountLo; 03291 bs->cache <<= bitCountLo; 03292 } 03293 03294 pZeroCounterOut[0] = zeroCounter; 03295 pRiceParamPartOut[0] = riceParamPart; 03296 03297 return DRFLAC_TRUE; 03298 } 03299 #endif 03300 03301 static DRFLAC_INLINE drflac_bool32 drflac__read_rice_parts_x1(drflac_bs* bs, drflac_uint8 riceParam, drflac_uint32* pZeroCounterOut, drflac_uint32* pRiceParamPartOut) 03302 { 03303 drflac_uint32 riceParamPlus1 = riceParam + 1; 03304 /*drflac_cache_t riceParamPlus1Mask = DRFLAC_CACHE_L1_SELECTION_MASK(riceParamPlus1);*/ 03305 drflac_uint32 riceParamPlus1Shift = DRFLAC_CACHE_L1_SELECTION_SHIFT(bs, riceParamPlus1); 03306 drflac_uint32 riceParamPlus1MaxConsumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs) - riceParamPlus1; 03307 03308 /* 03309 The idea here is to use local variables for the cache in an attempt to encourage the compiler to store them in registers. I have 03310 no idea how this will work in practice... 03311 */ 03312 drflac_cache_t bs_cache = bs->cache; 03313 drflac_uint32 bs_consumedBits = bs->consumedBits; 03314 03315 /* The first thing to do is find the first unset bit. Most likely a bit will be set in the current cache line. */ 03316 drflac_uint32 lzcount = drflac__clz(bs_cache); 03317 if (lzcount < sizeof(bs_cache)*8) { 03318 pZeroCounterOut[0] = lzcount; 03319 03320 /* 03321 It is most likely that the riceParam part (which comes after the zero counter) is also on this cache line. When extracting 03322 this, we include the set bit from the unary coded part because it simplifies cache management. This bit will be handled 03323 outside of this function at a higher level. 03324 */ 03325 extract_rice_param_part: 03326 bs_cache <<= lzcount; 03327 bs_consumedBits += lzcount; 03328 03329 if (bs_consumedBits <= riceParamPlus1MaxConsumedBits) { 03330 /* Getting here means the rice parameter part is wholly contained within the current cache line. */ 03331 pRiceParamPartOut[0] = (drflac_uint32)(bs_cache >> riceParamPlus1Shift); 03332 bs_cache <<= riceParamPlus1; 03333 bs_consumedBits += riceParamPlus1; 03334 } else { 03335 drflac_uint32 riceParamPartHi; 03336 drflac_uint32 riceParamPartLo; 03337 drflac_uint32 riceParamPartLoBitCount; 03338 03339 /* 03340 Getting here means the rice parameter part straddles the cache line. We need to read from the tail of the current cache 03341 line, reload the cache, and then combine it with the head of the next cache line. 03342 */ 03343 03344 /* Grab the high part of the rice parameter part. */ 03345 riceParamPartHi = (drflac_uint32)(bs_cache >> riceParamPlus1Shift); 03346 03347 /* Before reloading the cache we need to grab the size in bits of the low part. */ 03348 riceParamPartLoBitCount = bs_consumedBits - riceParamPlus1MaxConsumedBits; 03349 DRFLAC_ASSERT(riceParamPartLoBitCount > 0 && riceParamPartLoBitCount < 32); 03350 03351 /* Now reload the cache. */ 03352 if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) { 03353 #ifndef DR_FLAC_NO_CRC 03354 drflac__update_crc16(bs); 03355 #endif 03356 bs_cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]); 03357 bs_consumedBits = riceParamPartLoBitCount; 03358 #ifndef DR_FLAC_NO_CRC 03359 bs->crc16Cache = bs_cache; 03360 #endif 03361 } else { 03362 /* Slow path. We need to fetch more data from the client. */ 03363 if (!drflac__reload_cache(bs)) { 03364 return DRFLAC_FALSE; 03365 } 03366 03367 bs_cache = bs->cache; 03368 bs_consumedBits = bs->consumedBits + riceParamPartLoBitCount; 03369 } 03370 03371 /* We should now have enough information to construct the rice parameter part. */ 03372 riceParamPartLo = (drflac_uint32)(bs_cache >> (DRFLAC_CACHE_L1_SELECTION_SHIFT(bs, riceParamPartLoBitCount))); 03373 pRiceParamPartOut[0] = riceParamPartHi | riceParamPartLo; 03374 03375 bs_cache <<= riceParamPartLoBitCount; 03376 } 03377 } else { 03378 /* 03379 Getting here means there are no bits set on the cache line. This is a less optimal case because we just wasted a call 03380 to drflac__clz() and we need to reload the cache. 03381 */ 03382 drflac_uint32 zeroCounter = (drflac_uint32)(DRFLAC_CACHE_L1_SIZE_BITS(bs) - bs_consumedBits); 03383 for (;;) { 03384 if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) { 03385 #ifndef DR_FLAC_NO_CRC 03386 drflac__update_crc16(bs); 03387 #endif 03388 bs_cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]); 03389 bs_consumedBits = 0; 03390 #ifndef DR_FLAC_NO_CRC 03391 bs->crc16Cache = bs_cache; 03392 #endif 03393 } else { 03394 /* Slow path. We need to fetch more data from the client. */ 03395 if (!drflac__reload_cache(bs)) { 03396 return DRFLAC_FALSE; 03397 } 03398 03399 bs_cache = bs->cache; 03400 bs_consumedBits = bs->consumedBits; 03401 } 03402 03403 lzcount = drflac__clz(bs_cache); 03404 zeroCounter += lzcount; 03405 03406 if (lzcount < sizeof(bs_cache)*8) { 03407 break; 03408 } 03409 } 03410 03411 pZeroCounterOut[0] = zeroCounter; 03412 goto extract_rice_param_part; 03413 } 03414 03415 /* Make sure the cache is restored at the end of it all. */ 03416 bs->cache = bs_cache; 03417 bs->consumedBits = bs_consumedBits; 03418 03419 return DRFLAC_TRUE; 03420 } 03421 03422 static DRFLAC_INLINE drflac_bool32 drflac__seek_rice_parts(drflac_bs* bs, drflac_uint8 riceParam) 03423 { 03424 drflac_uint32 riceParamPlus1 = riceParam + 1; 03425 drflac_uint32 riceParamPlus1MaxConsumedBits = DRFLAC_CACHE_L1_SIZE_BITS(bs) - riceParamPlus1; 03426 03427 /* 03428 The idea here is to use local variables for the cache in an attempt to encourage the compiler to store them in registers. I have 03429 no idea how this will work in practice... 03430 */ 03431 drflac_cache_t bs_cache = bs->cache; 03432 drflac_uint32 bs_consumedBits = bs->consumedBits; 03433 03434 /* The first thing to do is find the first unset bit. Most likely a bit will be set in the current cache line. */ 03435 drflac_uint32 lzcount = drflac__clz(bs_cache); 03436 if (lzcount < sizeof(bs_cache)*8) { 03437 /* 03438 It is most likely that the riceParam part (which comes after the zero counter) is also on this cache line. When extracting 03439 this, we include the set bit from the unary coded part because it simplifies cache management. This bit will be handled 03440 outside of this function at a higher level. 03441 */ 03442 extract_rice_param_part: 03443 bs_cache <<= lzcount; 03444 bs_consumedBits += lzcount; 03445 03446 if (bs_consumedBits <= riceParamPlus1MaxConsumedBits) { 03447 /* Getting here means the rice parameter part is wholly contained within the current cache line. */ 03448 bs_cache <<= riceParamPlus1; 03449 bs_consumedBits += riceParamPlus1; 03450 } else { 03451 /* 03452 Getting here means the rice parameter part straddles the cache line. We need to read from the tail of the current cache 03453 line, reload the cache, and then combine it with the head of the next cache line. 03454 */ 03455 03456 /* Before reloading the cache we need to grab the size in bits of the low part. */ 03457 drflac_uint32 riceParamPartLoBitCount = bs_consumedBits - riceParamPlus1MaxConsumedBits; 03458 DRFLAC_ASSERT(riceParamPartLoBitCount > 0 && riceParamPartLoBitCount < 32); 03459 03460 /* Now reload the cache. */ 03461 if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) { 03462 #ifndef DR_FLAC_NO_CRC 03463 drflac__update_crc16(bs); 03464 #endif 03465 bs_cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]); 03466 bs_consumedBits = riceParamPartLoBitCount; 03467 #ifndef DR_FLAC_NO_CRC 03468 bs->crc16Cache = bs_cache; 03469 #endif 03470 } else { 03471 /* Slow path. We need to fetch more data from the client. */ 03472 if (!drflac__reload_cache(bs)) { 03473 return DRFLAC_FALSE; 03474 } 03475 03476 bs_cache = bs->cache; 03477 bs_consumedBits = bs->consumedBits + riceParamPartLoBitCount; 03478 } 03479 03480 bs_cache <<= riceParamPartLoBitCount; 03481 } 03482 } else { 03483 /* 03484 Getting here means there are no bits set on the cache line. This is a less optimal case because we just wasted a call 03485 to drflac__clz() and we need to reload the cache. 03486 */ 03487 for (;;) { 03488 if (bs->nextL2Line < DRFLAC_CACHE_L2_LINE_COUNT(bs)) { 03489 #ifndef DR_FLAC_NO_CRC 03490 drflac__update_crc16(bs); 03491 #endif 03492 bs_cache = drflac__be2host__cache_line(bs->cacheL2[bs->nextL2Line++]); 03493 bs_consumedBits = 0; 03494 #ifndef DR_FLAC_NO_CRC 03495 bs->crc16Cache = bs_cache; 03496 #endif 03497 } else { 03498 /* Slow path. We need to fetch more data from the client. */ 03499 if (!drflac__reload_cache(bs)) { 03500 return DRFLAC_FALSE; 03501 } 03502 03503 bs_cache = bs->cache; 03504 bs_consumedBits = bs->consumedBits; 03505 } 03506 03507 lzcount = drflac__clz(bs_cache); 03508 if (lzcount < sizeof(bs_cache)*8) { 03509 break; 03510 } 03511 } 03512 03513 goto extract_rice_param_part; 03514 } 03515 03516 /* Make sure the cache is restored at the end of it all. */ 03517 bs->cache = bs_cache; 03518 bs->consumedBits = bs_consumedBits; 03519 03520 return DRFLAC_TRUE; 03521 } 03522 03523 03524 static drflac_bool32 drflac__decode_samples_with_residual__rice__scalar_zeroorder(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 03525 { 03526 drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF}; 03527 drflac_uint32 zeroCountPart0; 03528 drflac_uint32 riceParamPart0; 03529 drflac_uint32 riceParamMask; 03530 drflac_uint32 i; 03531 03532 DRFLAC_ASSERT(bs != NULL); 03533 DRFLAC_ASSERT(count > 0); 03534 DRFLAC_ASSERT(pSamplesOut != NULL); 03535 03536 (void)bitsPerSample; 03537 (void)order; 03538 (void)shift; 03539 (void)coefficients; 03540 03541 riceParamMask = (drflac_uint32)~((~0UL) << riceParam); 03542 03543 i = 0; 03544 while (i < count) { 03545 /* Rice extraction. */ 03546 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart0, &riceParamPart0)) { 03547 return DRFLAC_FALSE; 03548 } 03549 03550 /* Rice reconstruction. */ 03551 riceParamPart0 &= riceParamMask; 03552 riceParamPart0 |= (zeroCountPart0 << riceParam); 03553 riceParamPart0 = (riceParamPart0 >> 1) ^ t[riceParamPart0 & 0x01]; 03554 03555 pSamplesOut[i] = riceParamPart0; 03556 03557 i += 1; 03558 } 03559 03560 return DRFLAC_TRUE; 03561 } 03562 03563 static drflac_bool32 drflac__decode_samples_with_residual__rice__scalar(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 03564 { 03565 drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF}; 03566 drflac_uint32 zeroCountPart0 = 0; 03567 drflac_uint32 zeroCountPart1 = 0; 03568 drflac_uint32 zeroCountPart2 = 0; 03569 drflac_uint32 zeroCountPart3 = 0; 03570 drflac_uint32 riceParamPart0 = 0; 03571 drflac_uint32 riceParamPart1 = 0; 03572 drflac_uint32 riceParamPart2 = 0; 03573 drflac_uint32 riceParamPart3 = 0; 03574 drflac_uint32 riceParamMask; 03575 const drflac_int32* pSamplesOutEnd; 03576 drflac_uint32 i; 03577 03578 DRFLAC_ASSERT(bs != NULL); 03579 DRFLAC_ASSERT(count > 0); 03580 DRFLAC_ASSERT(pSamplesOut != NULL); 03581 03582 if (order == 0) { 03583 return drflac__decode_samples_with_residual__rice__scalar_zeroorder(bs, bitsPerSample, count, riceParam, order, shift, coefficients, pSamplesOut); 03584 } 03585 03586 riceParamMask = (drflac_uint32)~((~0UL) << riceParam); 03587 pSamplesOutEnd = pSamplesOut + (count & ~3); 03588 03589 if (bitsPerSample+shift > 32) { 03590 while (pSamplesOut < pSamplesOutEnd) { 03591 /* 03592 Rice extraction. It's faster to do this one at a time against local variables than it is to use the x4 version 03593 against an array. Not sure why, but perhaps it's making more efficient use of registers? 03594 */ 03595 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart0, &riceParamPart0) || 03596 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart1, &riceParamPart1) || 03597 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart2, &riceParamPart2) || 03598 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart3, &riceParamPart3)) { 03599 return DRFLAC_FALSE; 03600 } 03601 03602 riceParamPart0 &= riceParamMask; 03603 riceParamPart1 &= riceParamMask; 03604 riceParamPart2 &= riceParamMask; 03605 riceParamPart3 &= riceParamMask; 03606 03607 riceParamPart0 |= (zeroCountPart0 << riceParam); 03608 riceParamPart1 |= (zeroCountPart1 << riceParam); 03609 riceParamPart2 |= (zeroCountPart2 << riceParam); 03610 riceParamPart3 |= (zeroCountPart3 << riceParam); 03611 03612 riceParamPart0 = (riceParamPart0 >> 1) ^ t[riceParamPart0 & 0x01]; 03613 riceParamPart1 = (riceParamPart1 >> 1) ^ t[riceParamPart1 & 0x01]; 03614 riceParamPart2 = (riceParamPart2 >> 1) ^ t[riceParamPart2 & 0x01]; 03615 riceParamPart3 = (riceParamPart3 >> 1) ^ t[riceParamPart3 & 0x01]; 03616 03617 pSamplesOut[0] = riceParamPart0 + drflac__calculate_prediction_64(order, shift, coefficients, pSamplesOut + 0); 03618 pSamplesOut[1] = riceParamPart1 + drflac__calculate_prediction_64(order, shift, coefficients, pSamplesOut + 1); 03619 pSamplesOut[2] = riceParamPart2 + drflac__calculate_prediction_64(order, shift, coefficients, pSamplesOut + 2); 03620 pSamplesOut[3] = riceParamPart3 + drflac__calculate_prediction_64(order, shift, coefficients, pSamplesOut + 3); 03621 03622 pSamplesOut += 4; 03623 } 03624 } else { 03625 while (pSamplesOut < pSamplesOutEnd) { 03626 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart0, &riceParamPart0) || 03627 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart1, &riceParamPart1) || 03628 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart2, &riceParamPart2) || 03629 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart3, &riceParamPart3)) { 03630 return DRFLAC_FALSE; 03631 } 03632 03633 riceParamPart0 &= riceParamMask; 03634 riceParamPart1 &= riceParamMask; 03635 riceParamPart2 &= riceParamMask; 03636 riceParamPart3 &= riceParamMask; 03637 03638 riceParamPart0 |= (zeroCountPart0 << riceParam); 03639 riceParamPart1 |= (zeroCountPart1 << riceParam); 03640 riceParamPart2 |= (zeroCountPart2 << riceParam); 03641 riceParamPart3 |= (zeroCountPart3 << riceParam); 03642 03643 riceParamPart0 = (riceParamPart0 >> 1) ^ t[riceParamPart0 & 0x01]; 03644 riceParamPart1 = (riceParamPart1 >> 1) ^ t[riceParamPart1 & 0x01]; 03645 riceParamPart2 = (riceParamPart2 >> 1) ^ t[riceParamPart2 & 0x01]; 03646 riceParamPart3 = (riceParamPart3 >> 1) ^ t[riceParamPart3 & 0x01]; 03647 03648 pSamplesOut[0] = riceParamPart0 + drflac__calculate_prediction_32(order, shift, coefficients, pSamplesOut + 0); 03649 pSamplesOut[1] = riceParamPart1 + drflac__calculate_prediction_32(order, shift, coefficients, pSamplesOut + 1); 03650 pSamplesOut[2] = riceParamPart2 + drflac__calculate_prediction_32(order, shift, coefficients, pSamplesOut + 2); 03651 pSamplesOut[3] = riceParamPart3 + drflac__calculate_prediction_32(order, shift, coefficients, pSamplesOut + 3); 03652 03653 pSamplesOut += 4; 03654 } 03655 } 03656 03657 i = (count & ~3); 03658 while (i < count) { 03659 /* Rice extraction. */ 03660 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountPart0, &riceParamPart0)) { 03661 return DRFLAC_FALSE; 03662 } 03663 03664 /* Rice reconstruction. */ 03665 riceParamPart0 &= riceParamMask; 03666 riceParamPart0 |= (zeroCountPart0 << riceParam); 03667 riceParamPart0 = (riceParamPart0 >> 1) ^ t[riceParamPart0 & 0x01]; 03668 /*riceParamPart0 = (riceParamPart0 >> 1) ^ (~(riceParamPart0 & 0x01) + 1);*/ 03669 03670 /* Sample reconstruction. */ 03671 if (bitsPerSample+shift > 32) { 03672 pSamplesOut[0] = riceParamPart0 + drflac__calculate_prediction_64(order, shift, coefficients, pSamplesOut + 0); 03673 } else { 03674 pSamplesOut[0] = riceParamPart0 + drflac__calculate_prediction_32(order, shift, coefficients, pSamplesOut + 0); 03675 } 03676 03677 i += 1; 03678 pSamplesOut += 1; 03679 } 03680 03681 return DRFLAC_TRUE; 03682 } 03683 03684 #if defined(DRFLAC_SUPPORT_SSE2) 03685 static DRFLAC_INLINE __m128i drflac__mm_packs_interleaved_epi32(__m128i a, __m128i b) 03686 { 03687 __m128i r; 03688 03689 /* Pack. */ 03690 r = _mm_packs_epi32(a, b); 03691 03692 /* a3a2 a1a0 b3b2 b1b0 -> a3a2 b3b2 a1a0 b1b0 */ 03693 r = _mm_shuffle_epi32(r, _MM_SHUFFLE(3, 1, 2, 0)); 03694 03695 /* a3a2 b3b2 a1a0 b1b0 -> a3b3 a2b2 a1b1 a0b0 */ 03696 r = _mm_shufflehi_epi16(r, _MM_SHUFFLE(3, 1, 2, 0)); 03697 r = _mm_shufflelo_epi16(r, _MM_SHUFFLE(3, 1, 2, 0)); 03698 03699 return r; 03700 } 03701 #endif 03702 03703 #if defined(DRFLAC_SUPPORT_SSE41) 03704 static DRFLAC_INLINE __m128i drflac__mm_not_si128(__m128i a) 03705 { 03706 return _mm_xor_si128(a, _mm_cmpeq_epi32(_mm_setzero_si128(), _mm_setzero_si128())); 03707 } 03708 03709 static DRFLAC_INLINE __m128i drflac__mm_hadd_epi32(__m128i x) 03710 { 03711 __m128i x64 = _mm_add_epi32(x, _mm_shuffle_epi32(x, _MM_SHUFFLE(1, 0, 3, 2))); 03712 __m128i x32 = _mm_shufflelo_epi16(x64, _MM_SHUFFLE(1, 0, 3, 2)); 03713 return _mm_add_epi32(x64, x32); 03714 } 03715 03716 static DRFLAC_INLINE __m128i drflac__mm_hadd_epi64(__m128i x) 03717 { 03718 return _mm_add_epi64(x, _mm_shuffle_epi32(x, _MM_SHUFFLE(1, 0, 3, 2))); 03719 } 03720 03721 static DRFLAC_INLINE __m128i drflac__mm_srai_epi64(__m128i x, int count) 03722 { 03723 /* 03724 To simplify this we are assuming count < 32. This restriction allows us to work on a low side and a high side. The low side 03725 is shifted with zero bits, whereas the right side is shifted with sign bits. 03726 */ 03727 __m128i lo = _mm_srli_epi64(x, count); 03728 __m128i hi = _mm_srai_epi32(x, count); 03729 03730 hi = _mm_and_si128(hi, _mm_set_epi32(0xFFFFFFFF, 0, 0xFFFFFFFF, 0)); /* The high part needs to have the low part cleared. */ 03731 03732 return _mm_or_si128(lo, hi); 03733 } 03734 03735 static drflac_bool32 drflac__decode_samples_with_residual__rice__sse41_32(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 03736 { 03737 int i; 03738 drflac_uint32 riceParamMask; 03739 drflac_int32* pDecodedSamples = pSamplesOut; 03740 drflac_int32* pDecodedSamplesEnd = pSamplesOut + (count & ~3); 03741 drflac_uint32 zeroCountParts0 = 0; 03742 drflac_uint32 zeroCountParts1 = 0; 03743 drflac_uint32 zeroCountParts2 = 0; 03744 drflac_uint32 zeroCountParts3 = 0; 03745 drflac_uint32 riceParamParts0 = 0; 03746 drflac_uint32 riceParamParts1 = 0; 03747 drflac_uint32 riceParamParts2 = 0; 03748 drflac_uint32 riceParamParts3 = 0; 03749 __m128i coefficients128_0; 03750 __m128i coefficients128_4; 03751 __m128i coefficients128_8; 03752 __m128i samples128_0; 03753 __m128i samples128_4; 03754 __m128i samples128_8; 03755 __m128i riceParamMask128; 03756 03757 const drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF}; 03758 03759 riceParamMask = (drflac_uint32)~((~0UL) << riceParam); 03760 riceParamMask128 = _mm_set1_epi32(riceParamMask); 03761 03762 /* Pre-load. */ 03763 coefficients128_0 = _mm_setzero_si128(); 03764 coefficients128_4 = _mm_setzero_si128(); 03765 coefficients128_8 = _mm_setzero_si128(); 03766 03767 samples128_0 = _mm_setzero_si128(); 03768 samples128_4 = _mm_setzero_si128(); 03769 samples128_8 = _mm_setzero_si128(); 03770 03771 /* 03772 Pre-loading the coefficients and prior samples is annoying because we need to ensure we don't try reading more than 03773 what's available in the input buffers. It would be convenient to use a fall-through switch to do this, but this results 03774 in strict aliasing warnings with GCC. To work around this I'm just doing something hacky. This feels a bit convoluted 03775 so I think there's opportunity for this to be simplified. 03776 */ 03777 #if 1 03778 { 03779 int runningOrder = order; 03780 03781 /* 0 - 3. */ 03782 if (runningOrder >= 4) { 03783 coefficients128_0 = _mm_loadu_si128((const __m128i*)(coefficients + 0)); 03784 samples128_0 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 4)); 03785 runningOrder -= 4; 03786 } else { 03787 switch (runningOrder) { 03788 case 3: coefficients128_0 = _mm_set_epi32(0, coefficients[2], coefficients[1], coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], pSamplesOut[-2], pSamplesOut[-3], 0); break; 03789 case 2: coefficients128_0 = _mm_set_epi32(0, 0, coefficients[1], coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], pSamplesOut[-2], 0, 0); break; 03790 case 1: coefficients128_0 = _mm_set_epi32(0, 0, 0, coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], 0, 0, 0); break; 03791 } 03792 runningOrder = 0; 03793 } 03794 03795 /* 4 - 7 */ 03796 if (runningOrder >= 4) { 03797 coefficients128_4 = _mm_loadu_si128((const __m128i*)(coefficients + 4)); 03798 samples128_4 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 8)); 03799 runningOrder -= 4; 03800 } else { 03801 switch (runningOrder) { 03802 case 3: coefficients128_4 = _mm_set_epi32(0, coefficients[6], coefficients[5], coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], pSamplesOut[-6], pSamplesOut[-7], 0); break; 03803 case 2: coefficients128_4 = _mm_set_epi32(0, 0, coefficients[5], coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], pSamplesOut[-6], 0, 0); break; 03804 case 1: coefficients128_4 = _mm_set_epi32(0, 0, 0, coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], 0, 0, 0); break; 03805 } 03806 runningOrder = 0; 03807 } 03808 03809 /* 8 - 11 */ 03810 if (runningOrder == 4) { 03811 coefficients128_8 = _mm_loadu_si128((const __m128i*)(coefficients + 8)); 03812 samples128_8 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 12)); 03813 runningOrder -= 4; 03814 } else { 03815 switch (runningOrder) { 03816 case 3: coefficients128_8 = _mm_set_epi32(0, coefficients[10], coefficients[9], coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], pSamplesOut[-10], pSamplesOut[-11], 0); break; 03817 case 2: coefficients128_8 = _mm_set_epi32(0, 0, coefficients[9], coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], pSamplesOut[-10], 0, 0); break; 03818 case 1: coefficients128_8 = _mm_set_epi32(0, 0, 0, coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], 0, 0, 0); break; 03819 } 03820 runningOrder = 0; 03821 } 03822 03823 /* Coefficients need to be shuffled for our streaming algorithm below to work. Samples are already in the correct order from the loading routine above. */ 03824 coefficients128_0 = _mm_shuffle_epi32(coefficients128_0, _MM_SHUFFLE(0, 1, 2, 3)); 03825 coefficients128_4 = _mm_shuffle_epi32(coefficients128_4, _MM_SHUFFLE(0, 1, 2, 3)); 03826 coefficients128_8 = _mm_shuffle_epi32(coefficients128_8, _MM_SHUFFLE(0, 1, 2, 3)); 03827 } 03828 #else 03829 /* This causes strict-aliasing warnings with GCC. */ 03830 switch (order) 03831 { 03832 case 12: ((drflac_int32*)&coefficients128_8)[0] = coefficients[11]; ((drflac_int32*)&samples128_8)[0] = pDecodedSamples[-12]; 03833 case 11: ((drflac_int32*)&coefficients128_8)[1] = coefficients[10]; ((drflac_int32*)&samples128_8)[1] = pDecodedSamples[-11]; 03834 case 10: ((drflac_int32*)&coefficients128_8)[2] = coefficients[ 9]; ((drflac_int32*)&samples128_8)[2] = pDecodedSamples[-10]; 03835 case 9: ((drflac_int32*)&coefficients128_8)[3] = coefficients[ 8]; ((drflac_int32*)&samples128_8)[3] = pDecodedSamples[- 9]; 03836 case 8: ((drflac_int32*)&coefficients128_4)[0] = coefficients[ 7]; ((drflac_int32*)&samples128_4)[0] = pDecodedSamples[- 8]; 03837 case 7: ((drflac_int32*)&coefficients128_4)[1] = coefficients[ 6]; ((drflac_int32*)&samples128_4)[1] = pDecodedSamples[- 7]; 03838 case 6: ((drflac_int32*)&coefficients128_4)[2] = coefficients[ 5]; ((drflac_int32*)&samples128_4)[2] = pDecodedSamples[- 6]; 03839 case 5: ((drflac_int32*)&coefficients128_4)[3] = coefficients[ 4]; ((drflac_int32*)&samples128_4)[3] = pDecodedSamples[- 5]; 03840 case 4: ((drflac_int32*)&coefficients128_0)[0] = coefficients[ 3]; ((drflac_int32*)&samples128_0)[0] = pDecodedSamples[- 4]; 03841 case 3: ((drflac_int32*)&coefficients128_0)[1] = coefficients[ 2]; ((drflac_int32*)&samples128_0)[1] = pDecodedSamples[- 3]; 03842 case 2: ((drflac_int32*)&coefficients128_0)[2] = coefficients[ 1]; ((drflac_int32*)&samples128_0)[2] = pDecodedSamples[- 2]; 03843 case 1: ((drflac_int32*)&coefficients128_0)[3] = coefficients[ 0]; ((drflac_int32*)&samples128_0)[3] = pDecodedSamples[- 1]; 03844 } 03845 #endif 03846 03847 /* For this version we are doing one sample at a time. */ 03848 while (pDecodedSamples < pDecodedSamplesEnd) { 03849 __m128i prediction128; 03850 __m128i zeroCountPart128; 03851 __m128i riceParamPart128; 03852 03853 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts0, &riceParamParts0) || 03854 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts1, &riceParamParts1) || 03855 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts2, &riceParamParts2) || 03856 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts3, &riceParamParts3)) { 03857 return DRFLAC_FALSE; 03858 } 03859 03860 zeroCountPart128 = _mm_set_epi32(zeroCountParts3, zeroCountParts2, zeroCountParts1, zeroCountParts0); 03861 riceParamPart128 = _mm_set_epi32(riceParamParts3, riceParamParts2, riceParamParts1, riceParamParts0); 03862 03863 riceParamPart128 = _mm_and_si128(riceParamPart128, riceParamMask128); 03864 riceParamPart128 = _mm_or_si128(riceParamPart128, _mm_slli_epi32(zeroCountPart128, riceParam)); 03865 riceParamPart128 = _mm_xor_si128(_mm_srli_epi32(riceParamPart128, 1), _mm_add_epi32(drflac__mm_not_si128(_mm_and_si128(riceParamPart128, _mm_set1_epi32(0x01))), _mm_set1_epi32(0x01))); /* <-- SSE2 compatible */ 03866 /*riceParamPart128 = _mm_xor_si128(_mm_srli_epi32(riceParamPart128, 1), _mm_mullo_epi32(_mm_and_si128(riceParamPart128, _mm_set1_epi32(0x01)), _mm_set1_epi32(0xFFFFFFFF)));*/ /* <-- Only supported from SSE4.1 and is slower in my testing... */ 03867 03868 if (order <= 4) { 03869 for (i = 0; i < 4; i += 1) { 03870 prediction128 = _mm_mullo_epi32(coefficients128_0, samples128_0); 03871 03872 /* Horizontal add and shift. */ 03873 prediction128 = drflac__mm_hadd_epi32(prediction128); 03874 prediction128 = _mm_srai_epi32(prediction128, shift); 03875 prediction128 = _mm_add_epi32(riceParamPart128, prediction128); 03876 03877 samples128_0 = _mm_alignr_epi8(prediction128, samples128_0, 4); 03878 riceParamPart128 = _mm_alignr_epi8(_mm_setzero_si128(), riceParamPart128, 4); 03879 } 03880 } else if (order <= 8) { 03881 for (i = 0; i < 4; i += 1) { 03882 prediction128 = _mm_mullo_epi32(coefficients128_4, samples128_4); 03883 prediction128 = _mm_add_epi32(prediction128, _mm_mullo_epi32(coefficients128_0, samples128_0)); 03884 03885 /* Horizontal add and shift. */ 03886 prediction128 = drflac__mm_hadd_epi32(prediction128); 03887 prediction128 = _mm_srai_epi32(prediction128, shift); 03888 prediction128 = _mm_add_epi32(riceParamPart128, prediction128); 03889 03890 samples128_4 = _mm_alignr_epi8(samples128_0, samples128_4, 4); 03891 samples128_0 = _mm_alignr_epi8(prediction128, samples128_0, 4); 03892 riceParamPart128 = _mm_alignr_epi8(_mm_setzero_si128(), riceParamPart128, 4); 03893 } 03894 } else { 03895 for (i = 0; i < 4; i += 1) { 03896 prediction128 = _mm_mullo_epi32(coefficients128_8, samples128_8); 03897 prediction128 = _mm_add_epi32(prediction128, _mm_mullo_epi32(coefficients128_4, samples128_4)); 03898 prediction128 = _mm_add_epi32(prediction128, _mm_mullo_epi32(coefficients128_0, samples128_0)); 03899 03900 /* Horizontal add and shift. */ 03901 prediction128 = drflac__mm_hadd_epi32(prediction128); 03902 prediction128 = _mm_srai_epi32(prediction128, shift); 03903 prediction128 = _mm_add_epi32(riceParamPart128, prediction128); 03904 03905 samples128_8 = _mm_alignr_epi8(samples128_4, samples128_8, 4); 03906 samples128_4 = _mm_alignr_epi8(samples128_0, samples128_4, 4); 03907 samples128_0 = _mm_alignr_epi8(prediction128, samples128_0, 4); 03908 riceParamPart128 = _mm_alignr_epi8(_mm_setzero_si128(), riceParamPart128, 4); 03909 } 03910 } 03911 03912 /* We store samples in groups of 4. */ 03913 _mm_storeu_si128((__m128i*)pDecodedSamples, samples128_0); 03914 pDecodedSamples += 4; 03915 } 03916 03917 /* Make sure we process the last few samples. */ 03918 i = (count & ~3); 03919 while (i < (int)count) { 03920 /* Rice extraction. */ 03921 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts0, &riceParamParts0)) { 03922 return DRFLAC_FALSE; 03923 } 03924 03925 /* Rice reconstruction. */ 03926 riceParamParts0 &= riceParamMask; 03927 riceParamParts0 |= (zeroCountParts0 << riceParam); 03928 riceParamParts0 = (riceParamParts0 >> 1) ^ t[riceParamParts0 & 0x01]; 03929 03930 /* Sample reconstruction. */ 03931 pDecodedSamples[0] = riceParamParts0 + drflac__calculate_prediction_32(order, shift, coefficients, pDecodedSamples); 03932 03933 i += 1; 03934 pDecodedSamples += 1; 03935 } 03936 03937 return DRFLAC_TRUE; 03938 } 03939 03940 static drflac_bool32 drflac__decode_samples_with_residual__rice__sse41_64(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 03941 { 03942 int i; 03943 drflac_uint32 riceParamMask; 03944 drflac_int32* pDecodedSamples = pSamplesOut; 03945 drflac_int32* pDecodedSamplesEnd = pSamplesOut + (count & ~3); 03946 drflac_uint32 zeroCountParts0 = 0; 03947 drflac_uint32 zeroCountParts1 = 0; 03948 drflac_uint32 zeroCountParts2 = 0; 03949 drflac_uint32 zeroCountParts3 = 0; 03950 drflac_uint32 riceParamParts0 = 0; 03951 drflac_uint32 riceParamParts1 = 0; 03952 drflac_uint32 riceParamParts2 = 0; 03953 drflac_uint32 riceParamParts3 = 0; 03954 __m128i coefficients128_0; 03955 __m128i coefficients128_4; 03956 __m128i coefficients128_8; 03957 __m128i samples128_0; 03958 __m128i samples128_4; 03959 __m128i samples128_8; 03960 __m128i prediction128; 03961 __m128i riceParamMask128; 03962 03963 const drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF}; 03964 03965 DRFLAC_ASSERT(order <= 12); 03966 03967 riceParamMask = (drflac_uint32)~((~0UL) << riceParam); 03968 riceParamMask128 = _mm_set1_epi32(riceParamMask); 03969 03970 prediction128 = _mm_setzero_si128(); 03971 03972 /* Pre-load. */ 03973 coefficients128_0 = _mm_setzero_si128(); 03974 coefficients128_4 = _mm_setzero_si128(); 03975 coefficients128_8 = _mm_setzero_si128(); 03976 03977 samples128_0 = _mm_setzero_si128(); 03978 samples128_4 = _mm_setzero_si128(); 03979 samples128_8 = _mm_setzero_si128(); 03980 03981 #if 1 03982 { 03983 int runningOrder = order; 03984 03985 /* 0 - 3. */ 03986 if (runningOrder >= 4) { 03987 coefficients128_0 = _mm_loadu_si128((const __m128i*)(coefficients + 0)); 03988 samples128_0 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 4)); 03989 runningOrder -= 4; 03990 } else { 03991 switch (runningOrder) { 03992 case 3: coefficients128_0 = _mm_set_epi32(0, coefficients[2], coefficients[1], coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], pSamplesOut[-2], pSamplesOut[-3], 0); break; 03993 case 2: coefficients128_0 = _mm_set_epi32(0, 0, coefficients[1], coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], pSamplesOut[-2], 0, 0); break; 03994 case 1: coefficients128_0 = _mm_set_epi32(0, 0, 0, coefficients[0]); samples128_0 = _mm_set_epi32(pSamplesOut[-1], 0, 0, 0); break; 03995 } 03996 runningOrder = 0; 03997 } 03998 03999 /* 4 - 7 */ 04000 if (runningOrder >= 4) { 04001 coefficients128_4 = _mm_loadu_si128((const __m128i*)(coefficients + 4)); 04002 samples128_4 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 8)); 04003 runningOrder -= 4; 04004 } else { 04005 switch (runningOrder) { 04006 case 3: coefficients128_4 = _mm_set_epi32(0, coefficients[6], coefficients[5], coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], pSamplesOut[-6], pSamplesOut[-7], 0); break; 04007 case 2: coefficients128_4 = _mm_set_epi32(0, 0, coefficients[5], coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], pSamplesOut[-6], 0, 0); break; 04008 case 1: coefficients128_4 = _mm_set_epi32(0, 0, 0, coefficients[4]); samples128_4 = _mm_set_epi32(pSamplesOut[-5], 0, 0, 0); break; 04009 } 04010 runningOrder = 0; 04011 } 04012 04013 /* 8 - 11 */ 04014 if (runningOrder == 4) { 04015 coefficients128_8 = _mm_loadu_si128((const __m128i*)(coefficients + 8)); 04016 samples128_8 = _mm_loadu_si128((const __m128i*)(pSamplesOut - 12)); 04017 runningOrder -= 4; 04018 } else { 04019 switch (runningOrder) { 04020 case 3: coefficients128_8 = _mm_set_epi32(0, coefficients[10], coefficients[9], coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], pSamplesOut[-10], pSamplesOut[-11], 0); break; 04021 case 2: coefficients128_8 = _mm_set_epi32(0, 0, coefficients[9], coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], pSamplesOut[-10], 0, 0); break; 04022 case 1: coefficients128_8 = _mm_set_epi32(0, 0, 0, coefficients[8]); samples128_8 = _mm_set_epi32(pSamplesOut[-9], 0, 0, 0); break; 04023 } 04024 runningOrder = 0; 04025 } 04026 04027 /* Coefficients need to be shuffled for our streaming algorithm below to work. Samples are already in the correct order from the loading routine above. */ 04028 coefficients128_0 = _mm_shuffle_epi32(coefficients128_0, _MM_SHUFFLE(0, 1, 2, 3)); 04029 coefficients128_4 = _mm_shuffle_epi32(coefficients128_4, _MM_SHUFFLE(0, 1, 2, 3)); 04030 coefficients128_8 = _mm_shuffle_epi32(coefficients128_8, _MM_SHUFFLE(0, 1, 2, 3)); 04031 } 04032 #else 04033 switch (order) 04034 { 04035 case 12: ((drflac_int32*)&coefficients128_8)[0] = coefficients[11]; ((drflac_int32*)&samples128_8)[0] = pDecodedSamples[-12]; 04036 case 11: ((drflac_int32*)&coefficients128_8)[1] = coefficients[10]; ((drflac_int32*)&samples128_8)[1] = pDecodedSamples[-11]; 04037 case 10: ((drflac_int32*)&coefficients128_8)[2] = coefficients[ 9]; ((drflac_int32*)&samples128_8)[2] = pDecodedSamples[-10]; 04038 case 9: ((drflac_int32*)&coefficients128_8)[3] = coefficients[ 8]; ((drflac_int32*)&samples128_8)[3] = pDecodedSamples[- 9]; 04039 case 8: ((drflac_int32*)&coefficients128_4)[0] = coefficients[ 7]; ((drflac_int32*)&samples128_4)[0] = pDecodedSamples[- 8]; 04040 case 7: ((drflac_int32*)&coefficients128_4)[1] = coefficients[ 6]; ((drflac_int32*)&samples128_4)[1] = pDecodedSamples[- 7]; 04041 case 6: ((drflac_int32*)&coefficients128_4)[2] = coefficients[ 5]; ((drflac_int32*)&samples128_4)[2] = pDecodedSamples[- 6]; 04042 case 5: ((drflac_int32*)&coefficients128_4)[3] = coefficients[ 4]; ((drflac_int32*)&samples128_4)[3] = pDecodedSamples[- 5]; 04043 case 4: ((drflac_int32*)&coefficients128_0)[0] = coefficients[ 3]; ((drflac_int32*)&samples128_0)[0] = pDecodedSamples[- 4]; 04044 case 3: ((drflac_int32*)&coefficients128_0)[1] = coefficients[ 2]; ((drflac_int32*)&samples128_0)[1] = pDecodedSamples[- 3]; 04045 case 2: ((drflac_int32*)&coefficients128_0)[2] = coefficients[ 1]; ((drflac_int32*)&samples128_0)[2] = pDecodedSamples[- 2]; 04046 case 1: ((drflac_int32*)&coefficients128_0)[3] = coefficients[ 0]; ((drflac_int32*)&samples128_0)[3] = pDecodedSamples[- 1]; 04047 } 04048 #endif 04049 04050 /* For this version we are doing one sample at a time. */ 04051 while (pDecodedSamples < pDecodedSamplesEnd) { 04052 __m128i zeroCountPart128; 04053 __m128i riceParamPart128; 04054 04055 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts0, &riceParamParts0) || 04056 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts1, &riceParamParts1) || 04057 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts2, &riceParamParts2) || 04058 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts3, &riceParamParts3)) { 04059 return DRFLAC_FALSE; 04060 } 04061 04062 zeroCountPart128 = _mm_set_epi32(zeroCountParts3, zeroCountParts2, zeroCountParts1, zeroCountParts0); 04063 riceParamPart128 = _mm_set_epi32(riceParamParts3, riceParamParts2, riceParamParts1, riceParamParts0); 04064 04065 riceParamPart128 = _mm_and_si128(riceParamPart128, riceParamMask128); 04066 riceParamPart128 = _mm_or_si128(riceParamPart128, _mm_slli_epi32(zeroCountPart128, riceParam)); 04067 riceParamPart128 = _mm_xor_si128(_mm_srli_epi32(riceParamPart128, 1), _mm_add_epi32(drflac__mm_not_si128(_mm_and_si128(riceParamPart128, _mm_set1_epi32(1))), _mm_set1_epi32(1))); 04068 04069 for (i = 0; i < 4; i += 1) { 04070 prediction128 = _mm_xor_si128(prediction128, prediction128); /* Reset to 0. */ 04071 04072 switch (order) 04073 { 04074 case 12: 04075 case 11: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_8, _MM_SHUFFLE(1, 1, 0, 0)), _mm_shuffle_epi32(samples128_8, _MM_SHUFFLE(1, 1, 0, 0)))); 04076 case 10: 04077 case 9: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_8, _MM_SHUFFLE(3, 3, 2, 2)), _mm_shuffle_epi32(samples128_8, _MM_SHUFFLE(3, 3, 2, 2)))); 04078 case 8: 04079 case 7: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_4, _MM_SHUFFLE(1, 1, 0, 0)), _mm_shuffle_epi32(samples128_4, _MM_SHUFFLE(1, 1, 0, 0)))); 04080 case 6: 04081 case 5: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_4, _MM_SHUFFLE(3, 3, 2, 2)), _mm_shuffle_epi32(samples128_4, _MM_SHUFFLE(3, 3, 2, 2)))); 04082 case 4: 04083 case 3: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_0, _MM_SHUFFLE(1, 1, 0, 0)), _mm_shuffle_epi32(samples128_0, _MM_SHUFFLE(1, 1, 0, 0)))); 04084 case 2: 04085 case 1: prediction128 = _mm_add_epi64(prediction128, _mm_mul_epi32(_mm_shuffle_epi32(coefficients128_0, _MM_SHUFFLE(3, 3, 2, 2)), _mm_shuffle_epi32(samples128_0, _MM_SHUFFLE(3, 3, 2, 2)))); 04086 } 04087 04088 /* Horizontal add and shift. */ 04089 prediction128 = drflac__mm_hadd_epi64(prediction128); 04090 prediction128 = drflac__mm_srai_epi64(prediction128, shift); 04091 prediction128 = _mm_add_epi32(riceParamPart128, prediction128); 04092 04093 /* Our value should be sitting in prediction128[0]. We need to combine this with our SSE samples. */ 04094 samples128_8 = _mm_alignr_epi8(samples128_4, samples128_8, 4); 04095 samples128_4 = _mm_alignr_epi8(samples128_0, samples128_4, 4); 04096 samples128_0 = _mm_alignr_epi8(prediction128, samples128_0, 4); 04097 04098 /* Slide our rice parameter down so that the value in position 0 contains the next one to process. */ 04099 riceParamPart128 = _mm_alignr_epi8(_mm_setzero_si128(), riceParamPart128, 4); 04100 } 04101 04102 /* We store samples in groups of 4. */ 04103 _mm_storeu_si128((__m128i*)pDecodedSamples, samples128_0); 04104 pDecodedSamples += 4; 04105 } 04106 04107 /* Make sure we process the last few samples. */ 04108 i = (count & ~3); 04109 while (i < (int)count) { 04110 /* Rice extraction. */ 04111 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts0, &riceParamParts0)) { 04112 return DRFLAC_FALSE; 04113 } 04114 04115 /* Rice reconstruction. */ 04116 riceParamParts0 &= riceParamMask; 04117 riceParamParts0 |= (zeroCountParts0 << riceParam); 04118 riceParamParts0 = (riceParamParts0 >> 1) ^ t[riceParamParts0 & 0x01]; 04119 04120 /* Sample reconstruction. */ 04121 pDecodedSamples[0] = riceParamParts0 + drflac__calculate_prediction_64(order, shift, coefficients, pDecodedSamples); 04122 04123 i += 1; 04124 pDecodedSamples += 1; 04125 } 04126 04127 return DRFLAC_TRUE; 04128 } 04129 04130 static drflac_bool32 drflac__decode_samples_with_residual__rice__sse41(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 04131 { 04132 DRFLAC_ASSERT(bs != NULL); 04133 DRFLAC_ASSERT(count > 0); 04134 DRFLAC_ASSERT(pSamplesOut != NULL); 04135 04136 /* In my testing the order is rarely > 12, so in this case I'm going to simplify the SSE implementation by only handling order <= 12. */ 04137 if (order > 0 && order <= 12) { 04138 if (bitsPerSample+shift > 32) { 04139 return drflac__decode_samples_with_residual__rice__sse41_64(bs, count, riceParam, order, shift, coefficients, pSamplesOut); 04140 } else { 04141 return drflac__decode_samples_with_residual__rice__sse41_32(bs, count, riceParam, order, shift, coefficients, pSamplesOut); 04142 } 04143 } else { 04144 return drflac__decode_samples_with_residual__rice__scalar(bs, bitsPerSample, count, riceParam, order, shift, coefficients, pSamplesOut); 04145 } 04146 } 04147 #endif 04148 04149 #if defined(DRFLAC_SUPPORT_NEON) 04150 static DRFLAC_INLINE void drflac__vst2q_s32(drflac_int32* p, int32x4x2_t x) 04151 { 04152 vst1q_s32(p+0, x.val[0]); 04153 vst1q_s32(p+4, x.val[1]); 04154 } 04155 04156 static DRFLAC_INLINE void drflac__vst2q_u32(drflac_uint32* p, uint32x4x2_t x) 04157 { 04158 vst1q_u32(p+0, x.val[0]); 04159 vst1q_u32(p+4, x.val[1]); 04160 } 04161 04162 static DRFLAC_INLINE void drflac__vst2q_f32(float* p, float32x4x2_t x) 04163 { 04164 vst1q_f32(p+0, x.val[0]); 04165 vst1q_f32(p+4, x.val[1]); 04166 } 04167 04168 static DRFLAC_INLINE void drflac__vst2q_s16(drflac_int16* p, int16x4x2_t x) 04169 { 04170 vst1q_s16(p, vcombine_s16(x.val[0], x.val[1])); 04171 } 04172 04173 static DRFLAC_INLINE void drflac__vst2q_u16(drflac_uint16* p, uint16x4x2_t x) 04174 { 04175 vst1q_u16(p, vcombine_u16(x.val[0], x.val[1])); 04176 } 04177 04178 static DRFLAC_INLINE int32x4_t drflac__vdupq_n_s32x4(drflac_int32 x3, drflac_int32 x2, drflac_int32 x1, drflac_int32 x0) 04179 { 04180 drflac_int32 x[4]; 04181 x[3] = x3; 04182 x[2] = x2; 04183 x[1] = x1; 04184 x[0] = x0; 04185 return vld1q_s32(x); 04186 } 04187 04188 static DRFLAC_INLINE int32x4_t drflac__valignrq_s32_1(int32x4_t a, int32x4_t b) 04189 { 04190 /* Equivalent to SSE's _mm_alignr_epi8(a, b, 4) */ 04191 04192 /* Reference */ 04193 /*return drflac__vdupq_n_s32x4( 04194 vgetq_lane_s32(a, 0), 04195 vgetq_lane_s32(b, 3), 04196 vgetq_lane_s32(b, 2), 04197 vgetq_lane_s32(b, 1) 04198 );*/ 04199 04200 return vextq_s32(b, a, 1); 04201 } 04202 04203 static DRFLAC_INLINE uint32x4_t drflac__valignrq_u32_1(uint32x4_t a, uint32x4_t b) 04204 { 04205 /* Equivalent to SSE's _mm_alignr_epi8(a, b, 4) */ 04206 04207 /* Reference */ 04208 /*return drflac__vdupq_n_s32x4( 04209 vgetq_lane_s32(a, 0), 04210 vgetq_lane_s32(b, 3), 04211 vgetq_lane_s32(b, 2), 04212 vgetq_lane_s32(b, 1) 04213 );*/ 04214 04215 return vextq_u32(b, a, 1); 04216 } 04217 04218 static DRFLAC_INLINE int32x2_t drflac__vhaddq_s32(int32x4_t x) 04219 { 04220 /* The sum must end up in position 0. */ 04221 04222 /* Reference */ 04223 /*return vdupq_n_s32( 04224 vgetq_lane_s32(x, 3) + 04225 vgetq_lane_s32(x, 2) + 04226 vgetq_lane_s32(x, 1) + 04227 vgetq_lane_s32(x, 0) 04228 );*/ 04229 04230 int32x2_t r = vadd_s32(vget_high_s32(x), vget_low_s32(x)); 04231 return vpadd_s32(r, r); 04232 } 04233 04234 static DRFLAC_INLINE int64x1_t drflac__vhaddq_s64(int64x2_t x) 04235 { 04236 return vadd_s64(vget_high_s64(x), vget_low_s64(x)); 04237 } 04238 04239 static DRFLAC_INLINE int32x4_t drflac__vrevq_s32(int32x4_t x) 04240 { 04241 /* Reference */ 04242 /*return drflac__vdupq_n_s32x4( 04243 vgetq_lane_s32(x, 0), 04244 vgetq_lane_s32(x, 1), 04245 vgetq_lane_s32(x, 2), 04246 vgetq_lane_s32(x, 3) 04247 );*/ 04248 04249 return vrev64q_s32(vcombine_s32(vget_high_s32(x), vget_low_s32(x))); 04250 } 04251 04252 static DRFLAC_INLINE int32x4_t drflac__vnotq_s32(int32x4_t x) 04253 { 04254 return veorq_s32(x, vdupq_n_s32(0xFFFFFFFF)); 04255 } 04256 04257 static DRFLAC_INLINE uint32x4_t drflac__vnotq_u32(uint32x4_t x) 04258 { 04259 return veorq_u32(x, vdupq_n_u32(0xFFFFFFFF)); 04260 } 04261 04262 static drflac_bool32 drflac__decode_samples_with_residual__rice__neon_32(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 04263 { 04264 int i; 04265 drflac_uint32 riceParamMask; 04266 drflac_int32* pDecodedSamples = pSamplesOut; 04267 drflac_int32* pDecodedSamplesEnd = pSamplesOut + (count & ~3); 04268 drflac_uint32 zeroCountParts[4]; 04269 drflac_uint32 riceParamParts[4]; 04270 int32x4_t coefficients128_0; 04271 int32x4_t coefficients128_4; 04272 int32x4_t coefficients128_8; 04273 int32x4_t samples128_0; 04274 int32x4_t samples128_4; 04275 int32x4_t samples128_8; 04276 uint32x4_t riceParamMask128; 04277 int32x4_t riceParam128; 04278 int32x2_t shift64; 04279 uint32x4_t one128; 04280 04281 const drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF}; 04282 04283 riceParamMask = ~((~0UL) << riceParam); 04284 riceParamMask128 = vdupq_n_u32(riceParamMask); 04285 04286 riceParam128 = vdupq_n_s32(riceParam); 04287 shift64 = vdup_n_s32(-shift); /* Negate the shift because we'll be doing a variable shift using vshlq_s32(). */ 04288 one128 = vdupq_n_u32(1); 04289 04290 /* 04291 Pre-loading the coefficients and prior samples is annoying because we need to ensure we don't try reading more than 04292 what's available in the input buffers. It would be conenient to use a fall-through switch to do this, but this results 04293 in strict aliasing warnings with GCC. To work around this I'm just doing something hacky. This feels a bit convoluted 04294 so I think there's opportunity for this to be simplified. 04295 */ 04296 { 04297 int runningOrder = order; 04298 drflac_int32 tempC[4] = {0, 0, 0, 0}; 04299 drflac_int32 tempS[4] = {0, 0, 0, 0}; 04300 04301 /* 0 - 3. */ 04302 if (runningOrder >= 4) { 04303 coefficients128_0 = vld1q_s32(coefficients + 0); 04304 samples128_0 = vld1q_s32(pSamplesOut - 4); 04305 runningOrder -= 4; 04306 } else { 04307 switch (runningOrder) { 04308 case 3: tempC[2] = coefficients[2]; tempS[1] = pSamplesOut[-3]; /* fallthrough */ 04309 case 2: tempC[1] = coefficients[1]; tempS[2] = pSamplesOut[-2]; /* fallthrough */ 04310 case 1: tempC[0] = coefficients[0]; tempS[3] = pSamplesOut[-1]; /* fallthrough */ 04311 } 04312 04313 coefficients128_0 = vld1q_s32(tempC); 04314 samples128_0 = vld1q_s32(tempS); 04315 runningOrder = 0; 04316 } 04317 04318 /* 4 - 7 */ 04319 if (runningOrder >= 4) { 04320 coefficients128_4 = vld1q_s32(coefficients + 4); 04321 samples128_4 = vld1q_s32(pSamplesOut - 8); 04322 runningOrder -= 4; 04323 } else { 04324 switch (runningOrder) { 04325 case 3: tempC[2] = coefficients[6]; tempS[1] = pSamplesOut[-7]; /* fallthrough */ 04326 case 2: tempC[1] = coefficients[5]; tempS[2] = pSamplesOut[-6]; /* fallthrough */ 04327 case 1: tempC[0] = coefficients[4]; tempS[3] = pSamplesOut[-5]; /* fallthrough */ 04328 } 04329 04330 coefficients128_4 = vld1q_s32(tempC); 04331 samples128_4 = vld1q_s32(tempS); 04332 runningOrder = 0; 04333 } 04334 04335 /* 8 - 11 */ 04336 if (runningOrder == 4) { 04337 coefficients128_8 = vld1q_s32(coefficients + 8); 04338 samples128_8 = vld1q_s32(pSamplesOut - 12); 04339 runningOrder -= 4; 04340 } else { 04341 switch (runningOrder) { 04342 case 3: tempC[2] = coefficients[10]; tempS[1] = pSamplesOut[-11]; /* fallthrough */ 04343 case 2: tempC[1] = coefficients[ 9]; tempS[2] = pSamplesOut[-10]; /* fallthrough */ 04344 case 1: tempC[0] = coefficients[ 8]; tempS[3] = pSamplesOut[- 9]; /* fallthrough */ 04345 } 04346 04347 coefficients128_8 = vld1q_s32(tempC); 04348 samples128_8 = vld1q_s32(tempS); 04349 runningOrder = 0; 04350 } 04351 04352 /* Coefficients need to be shuffled for our streaming algorithm below to work. Samples are already in the correct order from the loading routine above. */ 04353 coefficients128_0 = drflac__vrevq_s32(coefficients128_0); 04354 coefficients128_4 = drflac__vrevq_s32(coefficients128_4); 04355 coefficients128_8 = drflac__vrevq_s32(coefficients128_8); 04356 } 04357 04358 /* For this version we are doing one sample at a time. */ 04359 while (pDecodedSamples < pDecodedSamplesEnd) { 04360 int32x4_t prediction128; 04361 int32x2_t prediction64; 04362 uint32x4_t zeroCountPart128; 04363 uint32x4_t riceParamPart128; 04364 04365 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[0], &riceParamParts[0]) || 04366 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[1], &riceParamParts[1]) || 04367 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[2], &riceParamParts[2]) || 04368 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[3], &riceParamParts[3])) { 04369 return DRFLAC_FALSE; 04370 } 04371 04372 zeroCountPart128 = vld1q_u32(zeroCountParts); 04373 riceParamPart128 = vld1q_u32(riceParamParts); 04374 04375 riceParamPart128 = vandq_u32(riceParamPart128, riceParamMask128); 04376 riceParamPart128 = vorrq_u32(riceParamPart128, vshlq_u32(zeroCountPart128, riceParam128)); 04377 riceParamPart128 = veorq_u32(vshrq_n_u32(riceParamPart128, 1), vaddq_u32(drflac__vnotq_u32(vandq_u32(riceParamPart128, one128)), one128)); 04378 04379 if (order <= 4) { 04380 for (i = 0; i < 4; i += 1) { 04381 prediction128 = vmulq_s32(coefficients128_0, samples128_0); 04382 04383 /* Horizontal add and shift. */ 04384 prediction64 = drflac__vhaddq_s32(prediction128); 04385 prediction64 = vshl_s32(prediction64, shift64); 04386 prediction64 = vadd_s32(prediction64, vget_low_s32(vreinterpretq_s32_u32(riceParamPart128))); 04387 04388 samples128_0 = drflac__valignrq_s32_1(vcombine_s32(prediction64, vdup_n_s32(0)), samples128_0); 04389 riceParamPart128 = drflac__valignrq_u32_1(vdupq_n_u32(0), riceParamPart128); 04390 } 04391 } else if (order <= 8) { 04392 for (i = 0; i < 4; i += 1) { 04393 prediction128 = vmulq_s32(coefficients128_4, samples128_4); 04394 prediction128 = vmlaq_s32(prediction128, coefficients128_0, samples128_0); 04395 04396 /* Horizontal add and shift. */ 04397 prediction64 = drflac__vhaddq_s32(prediction128); 04398 prediction64 = vshl_s32(prediction64, shift64); 04399 prediction64 = vadd_s32(prediction64, vget_low_s32(vreinterpretq_s32_u32(riceParamPart128))); 04400 04401 samples128_4 = drflac__valignrq_s32_1(samples128_0, samples128_4); 04402 samples128_0 = drflac__valignrq_s32_1(vcombine_s32(prediction64, vdup_n_s32(0)), samples128_0); 04403 riceParamPart128 = drflac__valignrq_u32_1(vdupq_n_u32(0), riceParamPart128); 04404 } 04405 } else { 04406 for (i = 0; i < 4; i += 1) { 04407 prediction128 = vmulq_s32(coefficients128_8, samples128_8); 04408 prediction128 = vmlaq_s32(prediction128, coefficients128_4, samples128_4); 04409 prediction128 = vmlaq_s32(prediction128, coefficients128_0, samples128_0); 04410 04411 /* Horizontal add and shift. */ 04412 prediction64 = drflac__vhaddq_s32(prediction128); 04413 prediction64 = vshl_s32(prediction64, shift64); 04414 prediction64 = vadd_s32(prediction64, vget_low_s32(vreinterpretq_s32_u32(riceParamPart128))); 04415 04416 samples128_8 = drflac__valignrq_s32_1(samples128_4, samples128_8); 04417 samples128_4 = drflac__valignrq_s32_1(samples128_0, samples128_4); 04418 samples128_0 = drflac__valignrq_s32_1(vcombine_s32(prediction64, vdup_n_s32(0)), samples128_0); 04419 riceParamPart128 = drflac__valignrq_u32_1(vdupq_n_u32(0), riceParamPart128); 04420 } 04421 } 04422 04423 /* We store samples in groups of 4. */ 04424 vst1q_s32(pDecodedSamples, samples128_0); 04425 pDecodedSamples += 4; 04426 } 04427 04428 /* Make sure we process the last few samples. */ 04429 i = (count & ~3); 04430 while (i < (int)count) { 04431 /* Rice extraction. */ 04432 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[0], &riceParamParts[0])) { 04433 return DRFLAC_FALSE; 04434 } 04435 04436 /* Rice reconstruction. */ 04437 riceParamParts[0] &= riceParamMask; 04438 riceParamParts[0] |= (zeroCountParts[0] << riceParam); 04439 riceParamParts[0] = (riceParamParts[0] >> 1) ^ t[riceParamParts[0] & 0x01]; 04440 04441 /* Sample reconstruction. */ 04442 pDecodedSamples[0] = riceParamParts[0] + drflac__calculate_prediction_32(order, shift, coefficients, pDecodedSamples); 04443 04444 i += 1; 04445 pDecodedSamples += 1; 04446 } 04447 04448 return DRFLAC_TRUE; 04449 } 04450 04451 static drflac_bool32 drflac__decode_samples_with_residual__rice__neon_64(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 04452 { 04453 int i; 04454 drflac_uint32 riceParamMask; 04455 drflac_int32* pDecodedSamples = pSamplesOut; 04456 drflac_int32* pDecodedSamplesEnd = pSamplesOut + (count & ~3); 04457 drflac_uint32 zeroCountParts[4]; 04458 drflac_uint32 riceParamParts[4]; 04459 int32x4_t coefficients128_0; 04460 int32x4_t coefficients128_4; 04461 int32x4_t coefficients128_8; 04462 int32x4_t samples128_0; 04463 int32x4_t samples128_4; 04464 int32x4_t samples128_8; 04465 uint32x4_t riceParamMask128; 04466 int32x4_t riceParam128; 04467 int64x1_t shift64; 04468 uint32x4_t one128; 04469 04470 const drflac_uint32 t[2] = {0x00000000, 0xFFFFFFFF}; 04471 04472 riceParamMask = ~((~0UL) << riceParam); 04473 riceParamMask128 = vdupq_n_u32(riceParamMask); 04474 04475 riceParam128 = vdupq_n_s32(riceParam); 04476 shift64 = vdup_n_s64(-shift); /* Negate the shift because we'll be doing a variable shift using vshlq_s32(). */ 04477 one128 = vdupq_n_u32(1); 04478 04479 /* 04480 Pre-loading the coefficients and prior samples is annoying because we need to ensure we don't try reading more than 04481 what's available in the input buffers. It would be conenient to use a fall-through switch to do this, but this results 04482 in strict aliasing warnings with GCC. To work around this I'm just doing something hacky. This feels a bit convoluted 04483 so I think there's opportunity for this to be simplified. 04484 */ 04485 { 04486 int runningOrder = order; 04487 drflac_int32 tempC[4] = {0, 0, 0, 0}; 04488 drflac_int32 tempS[4] = {0, 0, 0, 0}; 04489 04490 /* 0 - 3. */ 04491 if (runningOrder >= 4) { 04492 coefficients128_0 = vld1q_s32(coefficients + 0); 04493 samples128_0 = vld1q_s32(pSamplesOut - 4); 04494 runningOrder -= 4; 04495 } else { 04496 switch (runningOrder) { 04497 case 3: tempC[2] = coefficients[2]; tempS[1] = pSamplesOut[-3]; /* fallthrough */ 04498 case 2: tempC[1] = coefficients[1]; tempS[2] = pSamplesOut[-2]; /* fallthrough */ 04499 case 1: tempC[0] = coefficients[0]; tempS[3] = pSamplesOut[-1]; /* fallthrough */ 04500 } 04501 04502 coefficients128_0 = vld1q_s32(tempC); 04503 samples128_0 = vld1q_s32(tempS); 04504 runningOrder = 0; 04505 } 04506 04507 /* 4 - 7 */ 04508 if (runningOrder >= 4) { 04509 coefficients128_4 = vld1q_s32(coefficients + 4); 04510 samples128_4 = vld1q_s32(pSamplesOut - 8); 04511 runningOrder -= 4; 04512 } else { 04513 switch (runningOrder) { 04514 case 3: tempC[2] = coefficients[6]; tempS[1] = pSamplesOut[-7]; /* fallthrough */ 04515 case 2: tempC[1] = coefficients[5]; tempS[2] = pSamplesOut[-6]; /* fallthrough */ 04516 case 1: tempC[0] = coefficients[4]; tempS[3] = pSamplesOut[-5]; /* fallthrough */ 04517 } 04518 04519 coefficients128_4 = vld1q_s32(tempC); 04520 samples128_4 = vld1q_s32(tempS); 04521 runningOrder = 0; 04522 } 04523 04524 /* 8 - 11 */ 04525 if (runningOrder == 4) { 04526 coefficients128_8 = vld1q_s32(coefficients + 8); 04527 samples128_8 = vld1q_s32(pSamplesOut - 12); 04528 runningOrder -= 4; 04529 } else { 04530 switch (runningOrder) { 04531 case 3: tempC[2] = coefficients[10]; tempS[1] = pSamplesOut[-11]; /* fallthrough */ 04532 case 2: tempC[1] = coefficients[ 9]; tempS[2] = pSamplesOut[-10]; /* fallthrough */ 04533 case 1: tempC[0] = coefficients[ 8]; tempS[3] = pSamplesOut[- 9]; /* fallthrough */ 04534 } 04535 04536 coefficients128_8 = vld1q_s32(tempC); 04537 samples128_8 = vld1q_s32(tempS); 04538 runningOrder = 0; 04539 } 04540 04541 /* Coefficients need to be shuffled for our streaming algorithm below to work. Samples are already in the correct order from the loading routine above. */ 04542 coefficients128_0 = drflac__vrevq_s32(coefficients128_0); 04543 coefficients128_4 = drflac__vrevq_s32(coefficients128_4); 04544 coefficients128_8 = drflac__vrevq_s32(coefficients128_8); 04545 } 04546 04547 /* For this version we are doing one sample at a time. */ 04548 while (pDecodedSamples < pDecodedSamplesEnd) { 04549 int64x2_t prediction128; 04550 uint32x4_t zeroCountPart128; 04551 uint32x4_t riceParamPart128; 04552 04553 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[0], &riceParamParts[0]) || 04554 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[1], &riceParamParts[1]) || 04555 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[2], &riceParamParts[2]) || 04556 !drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[3], &riceParamParts[3])) { 04557 return DRFLAC_FALSE; 04558 } 04559 04560 zeroCountPart128 = vld1q_u32(zeroCountParts); 04561 riceParamPart128 = vld1q_u32(riceParamParts); 04562 04563 riceParamPart128 = vandq_u32(riceParamPart128, riceParamMask128); 04564 riceParamPart128 = vorrq_u32(riceParamPart128, vshlq_u32(zeroCountPart128, riceParam128)); 04565 riceParamPart128 = veorq_u32(vshrq_n_u32(riceParamPart128, 1), vaddq_u32(drflac__vnotq_u32(vandq_u32(riceParamPart128, one128)), one128)); 04566 04567 for (i = 0; i < 4; i += 1) { 04568 int64x1_t prediction64; 04569 04570 prediction128 = veorq_s64(prediction128, prediction128); /* Reset to 0. */ 04571 switch (order) 04572 { 04573 case 12: 04574 case 11: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_low_s32(coefficients128_8), vget_low_s32(samples128_8))); 04575 case 10: 04576 case 9: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_high_s32(coefficients128_8), vget_high_s32(samples128_8))); 04577 case 8: 04578 case 7: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_low_s32(coefficients128_4), vget_low_s32(samples128_4))); 04579 case 6: 04580 case 5: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_high_s32(coefficients128_4), vget_high_s32(samples128_4))); 04581 case 4: 04582 case 3: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_low_s32(coefficients128_0), vget_low_s32(samples128_0))); 04583 case 2: 04584 case 1: prediction128 = vaddq_s64(prediction128, vmull_s32(vget_high_s32(coefficients128_0), vget_high_s32(samples128_0))); 04585 } 04586 04587 /* Horizontal add and shift. */ 04588 prediction64 = drflac__vhaddq_s64(prediction128); 04589 prediction64 = vshl_s64(prediction64, shift64); 04590 prediction64 = vadd_s64(prediction64, vdup_n_s64(vgetq_lane_u32(riceParamPart128, 0))); 04591 04592 /* Our value should be sitting in prediction64[0]. We need to combine this with our SSE samples. */ 04593 samples128_8 = drflac__valignrq_s32_1(samples128_4, samples128_8); 04594 samples128_4 = drflac__valignrq_s32_1(samples128_0, samples128_4); 04595 samples128_0 = drflac__valignrq_s32_1(vcombine_s32(vreinterpret_s32_s64(prediction64), vdup_n_s32(0)), samples128_0); 04596 04597 /* Slide our rice parameter down so that the value in position 0 contains the next one to process. */ 04598 riceParamPart128 = drflac__valignrq_u32_1(vdupq_n_u32(0), riceParamPart128); 04599 } 04600 04601 /* We store samples in groups of 4. */ 04602 vst1q_s32(pDecodedSamples, samples128_0); 04603 pDecodedSamples += 4; 04604 } 04605 04606 /* Make sure we process the last few samples. */ 04607 i = (count & ~3); 04608 while (i < (int)count) { 04609 /* Rice extraction. */ 04610 if (!drflac__read_rice_parts_x1(bs, riceParam, &zeroCountParts[0], &riceParamParts[0])) { 04611 return DRFLAC_FALSE; 04612 } 04613 04614 /* Rice reconstruction. */ 04615 riceParamParts[0] &= riceParamMask; 04616 riceParamParts[0] |= (zeroCountParts[0] << riceParam); 04617 riceParamParts[0] = (riceParamParts[0] >> 1) ^ t[riceParamParts[0] & 0x01]; 04618 04619 /* Sample reconstruction. */ 04620 pDecodedSamples[0] = riceParamParts[0] + drflac__calculate_prediction_64(order, shift, coefficients, pDecodedSamples); 04621 04622 i += 1; 04623 pDecodedSamples += 1; 04624 } 04625 04626 return DRFLAC_TRUE; 04627 } 04628 04629 static drflac_bool32 drflac__decode_samples_with_residual__rice__neon(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 04630 { 04631 DRFLAC_ASSERT(bs != NULL); 04632 DRFLAC_ASSERT(count > 0); 04633 DRFLAC_ASSERT(pSamplesOut != NULL); 04634 04635 /* In my testing the order is rarely > 12, so in this case I'm going to simplify the NEON implementation by only handling order <= 12. */ 04636 if (order > 0 && order <= 12) { 04637 if (bitsPerSample+shift > 32) { 04638 return drflac__decode_samples_with_residual__rice__neon_64(bs, count, riceParam, order, shift, coefficients, pSamplesOut); 04639 } else { 04640 return drflac__decode_samples_with_residual__rice__neon_32(bs, count, riceParam, order, shift, coefficients, pSamplesOut); 04641 } 04642 } else { 04643 return drflac__decode_samples_with_residual__rice__scalar(bs, bitsPerSample, count, riceParam, order, shift, coefficients, pSamplesOut); 04644 } 04645 } 04646 #endif 04647 04648 static drflac_bool32 drflac__decode_samples_with_residual__rice(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 riceParam, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 04649 { 04650 #if defined(DRFLAC_SUPPORT_SSE41) 04651 if (drflac__gIsSSE41Supported) { 04652 return drflac__decode_samples_with_residual__rice__sse41(bs, bitsPerSample, count, riceParam, order, shift, coefficients, pSamplesOut); 04653 } else 04654 #elif defined(DRFLAC_SUPPORT_NEON) 04655 if (drflac__gIsNEONSupported) { 04656 return drflac__decode_samples_with_residual__rice__neon(bs, bitsPerSample, count, riceParam, order, shift, coefficients, pSamplesOut); 04657 } else 04658 #endif 04659 { 04660 /* Scalar fallback. */ 04661 #if 0 04662 return drflac__decode_samples_with_residual__rice__reference(bs, bitsPerSample, count, riceParam, order, shift, coefficients, pSamplesOut); 04663 #else 04664 return drflac__decode_samples_with_residual__rice__scalar(bs, bitsPerSample, count, riceParam, order, shift, coefficients, pSamplesOut); 04665 #endif 04666 } 04667 } 04668 04669 /* Reads and seeks past a string of residual values as Rice codes. The decoder should be sitting on the first bit of the Rice codes. */ 04670 static drflac_bool32 drflac__read_and_seek_residual__rice(drflac_bs* bs, drflac_uint32 count, drflac_uint8 riceParam) 04671 { 04672 drflac_uint32 i; 04673 04674 DRFLAC_ASSERT(bs != NULL); 04675 DRFLAC_ASSERT(count > 0); 04676 04677 for (i = 0; i < count; ++i) { 04678 if (!drflac__seek_rice_parts(bs, riceParam)) { 04679 return DRFLAC_FALSE; 04680 } 04681 } 04682 04683 return DRFLAC_TRUE; 04684 } 04685 04686 static drflac_bool32 drflac__decode_samples_with_residual__unencoded(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 count, drflac_uint8 unencodedBitsPerSample, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pSamplesOut) 04687 { 04688 drflac_uint32 i; 04689 04690 DRFLAC_ASSERT(bs != NULL); 04691 DRFLAC_ASSERT(count > 0); 04692 DRFLAC_ASSERT(unencodedBitsPerSample <= 31); /* <-- unencodedBitsPerSample is a 5 bit number, so cannot exceed 31. */ 04693 DRFLAC_ASSERT(pSamplesOut != NULL); 04694 04695 for (i = 0; i < count; ++i) { 04696 if (unencodedBitsPerSample > 0) { 04697 if (!drflac__read_int32(bs, unencodedBitsPerSample, pSamplesOut + i)) { 04698 return DRFLAC_FALSE; 04699 } 04700 } else { 04701 pSamplesOut[i] = 0; 04702 } 04703 04704 if (bitsPerSample >= 24) { 04705 pSamplesOut[i] += drflac__calculate_prediction_64(order, shift, coefficients, pSamplesOut + i); 04706 } else { 04707 pSamplesOut[i] += drflac__calculate_prediction_32(order, shift, coefficients, pSamplesOut + i); 04708 } 04709 } 04710 04711 return DRFLAC_TRUE; 04712 } 04713 04714 04715 /* 04716 Reads and decodes the residual for the sub-frame the decoder is currently sitting on. This function should be called 04717 when the decoder is sitting at the very start of the RESIDUAL block. The first <order> residuals will be ignored. The 04718 <blockSize> and <order> parameters are used to determine how many residual values need to be decoded. 04719 */ 04720 static drflac_bool32 drflac__decode_samples_with_residual(drflac_bs* bs, drflac_uint32 bitsPerSample, drflac_uint32 blockSize, drflac_uint32 order, drflac_int32 shift, const drflac_int32* coefficients, drflac_int32* pDecodedSamples) 04721 { 04722 drflac_uint8 residualMethod; 04723 drflac_uint8 partitionOrder; 04724 drflac_uint32 samplesInPartition; 04725 drflac_uint32 partitionsRemaining; 04726 04727 DRFLAC_ASSERT(bs != NULL); 04728 DRFLAC_ASSERT(blockSize != 0); 04729 DRFLAC_ASSERT(pDecodedSamples != NULL); /* <-- Should we allow NULL, in which case we just seek past the residual rather than do a full decode? */ 04730 04731 if (!drflac__read_uint8(bs, 2, &residualMethod)) { 04732 return DRFLAC_FALSE; 04733 } 04734 04735 if (residualMethod != DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE && residualMethod != DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2) { 04736 return DRFLAC_FALSE; /* Unknown or unsupported residual coding method. */ 04737 } 04738 04739 /* Ignore the first <order> values. */ 04740 pDecodedSamples += order; 04741 04742 if (!drflac__read_uint8(bs, 4, &partitionOrder)) { 04743 return DRFLAC_FALSE; 04744 } 04745 04746 /* 04747 From the FLAC spec: 04748 The Rice partition order in a Rice-coded residual section must be less than or equal to 8. 04749 */ 04750 if (partitionOrder > 8) { 04751 return DRFLAC_FALSE; 04752 } 04753 04754 /* Validation check. */ 04755 if ((blockSize / (1 << partitionOrder)) <= order) { 04756 return DRFLAC_FALSE; 04757 } 04758 04759 samplesInPartition = (blockSize / (1 << partitionOrder)) - order; 04760 partitionsRemaining = (1 << partitionOrder); 04761 for (;;) { 04762 drflac_uint8 riceParam = 0; 04763 if (residualMethod == DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE) { 04764 if (!drflac__read_uint8(bs, 4, &riceParam)) { 04765 return DRFLAC_FALSE; 04766 } 04767 if (riceParam == 15) { 04768 riceParam = 0xFF; 04769 } 04770 } else if (residualMethod == DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2) { 04771 if (!drflac__read_uint8(bs, 5, &riceParam)) { 04772 return DRFLAC_FALSE; 04773 } 04774 if (riceParam == 31) { 04775 riceParam = 0xFF; 04776 } 04777 } 04778 04779 if (riceParam != 0xFF) { 04780 if (!drflac__decode_samples_with_residual__rice(bs, bitsPerSample, samplesInPartition, riceParam, order, shift, coefficients, pDecodedSamples)) { 04781 return DRFLAC_FALSE; 04782 } 04783 } else { 04784 drflac_uint8 unencodedBitsPerSample = 0; 04785 if (!drflac__read_uint8(bs, 5, &unencodedBitsPerSample)) { 04786 return DRFLAC_FALSE; 04787 } 04788 04789 if (!drflac__decode_samples_with_residual__unencoded(bs, bitsPerSample, samplesInPartition, unencodedBitsPerSample, order, shift, coefficients, pDecodedSamples)) { 04790 return DRFLAC_FALSE; 04791 } 04792 } 04793 04794 pDecodedSamples += samplesInPartition; 04795 04796 if (partitionsRemaining == 1) { 04797 break; 04798 } 04799 04800 partitionsRemaining -= 1; 04801 04802 if (partitionOrder != 0) { 04803 samplesInPartition = blockSize / (1 << partitionOrder); 04804 } 04805 } 04806 04807 return DRFLAC_TRUE; 04808 } 04809 04810 /* 04811 Reads and seeks past the residual for the sub-frame the decoder is currently sitting on. This function should be called 04812 when the decoder is sitting at the very start of the RESIDUAL block. The first <order> residuals will be set to 0. The 04813 <blockSize> and <order> parameters are used to determine how many residual values need to be decoded. 04814 */ 04815 static drflac_bool32 drflac__read_and_seek_residual(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 order) 04816 { 04817 drflac_uint8 residualMethod; 04818 drflac_uint8 partitionOrder; 04819 drflac_uint32 samplesInPartition; 04820 drflac_uint32 partitionsRemaining; 04821 04822 DRFLAC_ASSERT(bs != NULL); 04823 DRFLAC_ASSERT(blockSize != 0); 04824 04825 if (!drflac__read_uint8(bs, 2, &residualMethod)) { 04826 return DRFLAC_FALSE; 04827 } 04828 04829 if (residualMethod != DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE && residualMethod != DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2) { 04830 return DRFLAC_FALSE; /* Unknown or unsupported residual coding method. */ 04831 } 04832 04833 if (!drflac__read_uint8(bs, 4, &partitionOrder)) { 04834 return DRFLAC_FALSE; 04835 } 04836 04837 /* 04838 From the FLAC spec: 04839 The Rice partition order in a Rice-coded residual section must be less than or equal to 8. 04840 */ 04841 if (partitionOrder > 8) { 04842 return DRFLAC_FALSE; 04843 } 04844 04845 /* Validation check. */ 04846 if ((blockSize / (1 << partitionOrder)) <= order) { 04847 return DRFLAC_FALSE; 04848 } 04849 04850 samplesInPartition = (blockSize / (1 << partitionOrder)) - order; 04851 partitionsRemaining = (1 << partitionOrder); 04852 for (;;) 04853 { 04854 drflac_uint8 riceParam = 0; 04855 if (residualMethod == DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE) { 04856 if (!drflac__read_uint8(bs, 4, &riceParam)) { 04857 return DRFLAC_FALSE; 04858 } 04859 if (riceParam == 15) { 04860 riceParam = 0xFF; 04861 } 04862 } else if (residualMethod == DRFLAC_RESIDUAL_CODING_METHOD_PARTITIONED_RICE2) { 04863 if (!drflac__read_uint8(bs, 5, &riceParam)) { 04864 return DRFLAC_FALSE; 04865 } 04866 if (riceParam == 31) { 04867 riceParam = 0xFF; 04868 } 04869 } 04870 04871 if (riceParam != 0xFF) { 04872 if (!drflac__read_and_seek_residual__rice(bs, samplesInPartition, riceParam)) { 04873 return DRFLAC_FALSE; 04874 } 04875 } else { 04876 drflac_uint8 unencodedBitsPerSample = 0; 04877 if (!drflac__read_uint8(bs, 5, &unencodedBitsPerSample)) { 04878 return DRFLAC_FALSE; 04879 } 04880 04881 if (!drflac__seek_bits(bs, unencodedBitsPerSample * samplesInPartition)) { 04882 return DRFLAC_FALSE; 04883 } 04884 } 04885 04886 04887 if (partitionsRemaining == 1) { 04888 break; 04889 } 04890 04891 partitionsRemaining -= 1; 04892 samplesInPartition = blockSize / (1 << partitionOrder); 04893 } 04894 04895 return DRFLAC_TRUE; 04896 } 04897 04898 04899 static drflac_bool32 drflac__decode_samples__constant(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 subframeBitsPerSample, drflac_int32* pDecodedSamples) 04900 { 04901 drflac_uint32 i; 04902 04903 /* Only a single sample needs to be decoded here. */ 04904 drflac_int32 sample; 04905 if (!drflac__read_int32(bs, subframeBitsPerSample, &sample)) { 04906 return DRFLAC_FALSE; 04907 } 04908 04909 /* 04910 We don't really need to expand this, but it does simplify the process of reading samples. If this becomes a performance issue (unlikely) 04911 we'll want to look at a more efficient way. 04912 */ 04913 for (i = 0; i < blockSize; ++i) { 04914 pDecodedSamples[i] = sample; 04915 } 04916 04917 return DRFLAC_TRUE; 04918 } 04919 04920 static drflac_bool32 drflac__decode_samples__verbatim(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 subframeBitsPerSample, drflac_int32* pDecodedSamples) 04921 { 04922 drflac_uint32 i; 04923 04924 for (i = 0; i < blockSize; ++i) { 04925 drflac_int32 sample; 04926 if (!drflac__read_int32(bs, subframeBitsPerSample, &sample)) { 04927 return DRFLAC_FALSE; 04928 } 04929 04930 pDecodedSamples[i] = sample; 04931 } 04932 04933 return DRFLAC_TRUE; 04934 } 04935 04936 static drflac_bool32 drflac__decode_samples__fixed(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 subframeBitsPerSample, drflac_uint8 lpcOrder, drflac_int32* pDecodedSamples) 04937 { 04938 drflac_uint32 i; 04939 04940 static drflac_int32 lpcCoefficientsTable[5][4] = { 04941 {0, 0, 0, 0}, 04942 {1, 0, 0, 0}, 04943 {2, -1, 0, 0}, 04944 {3, -3, 1, 0}, 04945 {4, -6, 4, -1} 04946 }; 04947 04948 /* Warm up samples and coefficients. */ 04949 for (i = 0; i < lpcOrder; ++i) { 04950 drflac_int32 sample; 04951 if (!drflac__read_int32(bs, subframeBitsPerSample, &sample)) { 04952 return DRFLAC_FALSE; 04953 } 04954 04955 pDecodedSamples[i] = sample; 04956 } 04957 04958 if (!drflac__decode_samples_with_residual(bs, subframeBitsPerSample, blockSize, lpcOrder, 0, lpcCoefficientsTable[lpcOrder], pDecodedSamples)) { 04959 return DRFLAC_FALSE; 04960 } 04961 04962 return DRFLAC_TRUE; 04963 } 04964 04965 static drflac_bool32 drflac__decode_samples__lpc(drflac_bs* bs, drflac_uint32 blockSize, drflac_uint32 bitsPerSample, drflac_uint8 lpcOrder, drflac_int32* pDecodedSamples) 04966 { 04967 drflac_uint8 i; 04968 drflac_uint8 lpcPrecision; 04969 drflac_int8 lpcShift; 04970 drflac_int32 coefficients[32]; 04971 04972 /* Warm up samples. */ 04973 for (i = 0; i < lpcOrder; ++i) { 04974 drflac_int32 sample; 04975 if (!drflac__read_int32(bs, bitsPerSample, &sample)) { 04976 return DRFLAC_FALSE; 04977 } 04978 04979 pDecodedSamples[i] = sample; 04980 } 04981 04982 if (!drflac__read_uint8(bs, 4, &lpcPrecision)) { 04983 return DRFLAC_FALSE; 04984 } 04985 if (lpcPrecision == 15) { 04986 return DRFLAC_FALSE; /* Invalid. */ 04987 } 04988 lpcPrecision += 1; 04989 04990 if (!drflac__read_int8(bs, 5, &lpcShift)) { 04991 return DRFLAC_FALSE; 04992 } 04993 04994 DRFLAC_ZERO_MEMORY(coefficients, sizeof(coefficients)); 04995 for (i = 0; i < lpcOrder; ++i) { 04996 if (!drflac__read_int32(bs, lpcPrecision, coefficients + i)) { 04997 return DRFLAC_FALSE; 04998 } 04999 } 05000 05001 if (!drflac__decode_samples_with_residual(bs, bitsPerSample, blockSize, lpcOrder, lpcShift, coefficients, pDecodedSamples)) { 05002 return DRFLAC_FALSE; 05003 } 05004 05005 return DRFLAC_TRUE; 05006 } 05007 05008 05009 static drflac_bool32 drflac__read_next_flac_frame_header(drflac_bs* bs, drflac_uint8 streaminfoBitsPerSample, drflac_frame_header* header) 05010 { 05011 const drflac_uint32 sampleRateTable[12] = {0, 88200, 176400, 192000, 8000, 16000, 22050, 24000, 32000, 44100, 48000, 96000}; 05012 const drflac_uint8 bitsPerSampleTable[8] = {0, 8, 12, (drflac_uint8)-1, 16, 20, 24, (drflac_uint8)-1}; /* -1 = reserved. */ 05013 05014 DRFLAC_ASSERT(bs != NULL); 05015 DRFLAC_ASSERT(header != NULL); 05016 05017 /* Keep looping until we find a valid sync code. */ 05018 for (;;) { 05019 drflac_uint8 crc8 = 0xCE; /* 0xCE = drflac_crc8(0, 0x3FFE, 14); */ 05020 drflac_uint8 reserved = 0; 05021 drflac_uint8 blockingStrategy = 0; 05022 drflac_uint8 blockSize = 0; 05023 drflac_uint8 sampleRate = 0; 05024 drflac_uint8 channelAssignment = 0; 05025 drflac_uint8 bitsPerSample = 0; 05026 drflac_bool32 isVariableBlockSize; 05027 05028 if (!drflac__find_and_seek_to_next_sync_code(bs)) { 05029 return DRFLAC_FALSE; 05030 } 05031 05032 if (!drflac__read_uint8(bs, 1, &reserved)) { 05033 return DRFLAC_FALSE; 05034 } 05035 if (reserved == 1) { 05036 continue; 05037 } 05038 crc8 = drflac_crc8(crc8, reserved, 1); 05039 05040 if (!drflac__read_uint8(bs, 1, &blockingStrategy)) { 05041 return DRFLAC_FALSE; 05042 } 05043 crc8 = drflac_crc8(crc8, blockingStrategy, 1); 05044 05045 if (!drflac__read_uint8(bs, 4, &blockSize)) { 05046 return DRFLAC_FALSE; 05047 } 05048 if (blockSize == 0) { 05049 continue; 05050 } 05051 crc8 = drflac_crc8(crc8, blockSize, 4); 05052 05053 if (!drflac__read_uint8(bs, 4, &sampleRate)) { 05054 return DRFLAC_FALSE; 05055 } 05056 crc8 = drflac_crc8(crc8, sampleRate, 4); 05057 05058 if (!drflac__read_uint8(bs, 4, &channelAssignment)) { 05059 return DRFLAC_FALSE; 05060 } 05061 if (channelAssignment > 10) { 05062 continue; 05063 } 05064 crc8 = drflac_crc8(crc8, channelAssignment, 4); 05065 05066 if (!drflac__read_uint8(bs, 3, &bitsPerSample)) { 05067 return DRFLAC_FALSE; 05068 } 05069 if (bitsPerSample == 3 || bitsPerSample == 7) { 05070 continue; 05071 } 05072 crc8 = drflac_crc8(crc8, bitsPerSample, 3); 05073 05074 05075 if (!drflac__read_uint8(bs, 1, &reserved)) { 05076 return DRFLAC_FALSE; 05077 } 05078 if (reserved == 1) { 05079 continue; 05080 } 05081 crc8 = drflac_crc8(crc8, reserved, 1); 05082 05083 05084 isVariableBlockSize = blockingStrategy == 1; 05085 if (isVariableBlockSize) { 05086 drflac_uint64 pcmFrameNumber; 05087 drflac_result result = drflac__read_utf8_coded_number(bs, &pcmFrameNumber, &crc8); 05088 if (result != DRFLAC_SUCCESS) { 05089 if (result == DRFLAC_AT_END) { 05090 return DRFLAC_FALSE; 05091 } else { 05092 continue; 05093 } 05094 } 05095 header->flacFrameNumber = 0; 05096 header->pcmFrameNumber = pcmFrameNumber; 05097 } else { 05098 drflac_uint64 flacFrameNumber = 0; 05099 drflac_result result = drflac__read_utf8_coded_number(bs, &flacFrameNumber, &crc8); 05100 if (result != DRFLAC_SUCCESS) { 05101 if (result == DRFLAC_AT_END) { 05102 return DRFLAC_FALSE; 05103 } else { 05104 continue; 05105 } 05106 } 05107 header->flacFrameNumber = (drflac_uint32)flacFrameNumber; /* <-- Safe cast. */ 05108 header->pcmFrameNumber = 0; 05109 } 05110 05111 05112 DRFLAC_ASSERT(blockSize > 0); 05113 if (blockSize == 1) { 05114 header->blockSizeInPCMFrames = 192; 05115 } else if (blockSize >= 2 && blockSize <= 5) { 05116 header->blockSizeInPCMFrames = 576 * (1 << (blockSize - 2)); 05117 } else if (blockSize == 6) { 05118 if (!drflac__read_uint16(bs, 8, &header->blockSizeInPCMFrames)) { 05119 return DRFLAC_FALSE; 05120 } 05121 crc8 = drflac_crc8(crc8, header->blockSizeInPCMFrames, 8); 05122 header->blockSizeInPCMFrames += 1; 05123 } else if (blockSize == 7) { 05124 if (!drflac__read_uint16(bs, 16, &header->blockSizeInPCMFrames)) { 05125 return DRFLAC_FALSE; 05126 } 05127 crc8 = drflac_crc8(crc8, header->blockSizeInPCMFrames, 16); 05128 header->blockSizeInPCMFrames += 1; 05129 } else { 05130 DRFLAC_ASSERT(blockSize >= 8); 05131 header->blockSizeInPCMFrames = 256 * (1 << (blockSize - 8)); 05132 } 05133 05134 05135 if (sampleRate <= 11) { 05136 header->sampleRate = sampleRateTable[sampleRate]; 05137 } else if (sampleRate == 12) { 05138 if (!drflac__read_uint32(bs, 8, &header->sampleRate)) { 05139 return DRFLAC_FALSE; 05140 } 05141 crc8 = drflac_crc8(crc8, header->sampleRate, 8); 05142 header->sampleRate *= 1000; 05143 } else if (sampleRate == 13) { 05144 if (!drflac__read_uint32(bs, 16, &header->sampleRate)) { 05145 return DRFLAC_FALSE; 05146 } 05147 crc8 = drflac_crc8(crc8, header->sampleRate, 16); 05148 } else if (sampleRate == 14) { 05149 if (!drflac__read_uint32(bs, 16, &header->sampleRate)) { 05150 return DRFLAC_FALSE; 05151 } 05152 crc8 = drflac_crc8(crc8, header->sampleRate, 16); 05153 header->sampleRate *= 10; 05154 } else { 05155 continue; /* Invalid. Assume an invalid block. */ 05156 } 05157 05158 05159 header->channelAssignment = channelAssignment; 05160 05161 header->bitsPerSample = bitsPerSampleTable[bitsPerSample]; 05162 if (header->bitsPerSample == 0) { 05163 header->bitsPerSample = streaminfoBitsPerSample; 05164 } 05165 05166 if (!drflac__read_uint8(bs, 8, &header->crc8)) { 05167 return DRFLAC_FALSE; 05168 } 05169 05170 #ifndef DR_FLAC_NO_CRC 05171 if (header->crc8 != crc8) { 05172 continue; /* CRC mismatch. Loop back to the top and find the next sync code. */ 05173 } 05174 #endif 05175 return DRFLAC_TRUE; 05176 } 05177 } 05178 05179 static drflac_bool32 drflac__read_subframe_header(drflac_bs* bs, drflac_subframe* pSubframe) 05180 { 05181 drflac_uint8 header; 05182 int type; 05183 05184 if (!drflac__read_uint8(bs, 8, &header)) { 05185 return DRFLAC_FALSE; 05186 } 05187 05188 /* First bit should always be 0. */ 05189 if ((header & 0x80) != 0) { 05190 return DRFLAC_FALSE; 05191 } 05192 05193 type = (header & 0x7E) >> 1; 05194 if (type == 0) { 05195 pSubframe->subframeType = DRFLAC_SUBFRAME_CONSTANT; 05196 } else if (type == 1) { 05197 pSubframe->subframeType = DRFLAC_SUBFRAME_VERBATIM; 05198 } else { 05199 if ((type & 0x20) != 0) { 05200 pSubframe->subframeType = DRFLAC_SUBFRAME_LPC; 05201 pSubframe->lpcOrder = (drflac_uint8)(type & 0x1F) + 1; 05202 } else if ((type & 0x08) != 0) { 05203 pSubframe->subframeType = DRFLAC_SUBFRAME_FIXED; 05204 pSubframe->lpcOrder = (drflac_uint8)(type & 0x07); 05205 if (pSubframe->lpcOrder > 4) { 05206 pSubframe->subframeType = DRFLAC_SUBFRAME_RESERVED; 05207 pSubframe->lpcOrder = 0; 05208 } 05209 } else { 05210 pSubframe->subframeType = DRFLAC_SUBFRAME_RESERVED; 05211 } 05212 } 05213 05214 if (pSubframe->subframeType == DRFLAC_SUBFRAME_RESERVED) { 05215 return DRFLAC_FALSE; 05216 } 05217 05218 /* Wasted bits per sample. */ 05219 pSubframe->wastedBitsPerSample = 0; 05220 if ((header & 0x01) == 1) { 05221 unsigned int wastedBitsPerSample; 05222 if (!drflac__seek_past_next_set_bit(bs, &wastedBitsPerSample)) { 05223 return DRFLAC_FALSE; 05224 } 05225 pSubframe->wastedBitsPerSample = (drflac_uint8)wastedBitsPerSample + 1; 05226 } 05227 05228 return DRFLAC_TRUE; 05229 } 05230 05231 static drflac_bool32 drflac__decode_subframe(drflac_bs* bs, drflac_frame* frame, int subframeIndex, drflac_int32* pDecodedSamplesOut) 05232 { 05233 drflac_subframe* pSubframe; 05234 drflac_uint32 subframeBitsPerSample; 05235 05236 DRFLAC_ASSERT(bs != NULL); 05237 DRFLAC_ASSERT(frame != NULL); 05238 05239 pSubframe = frame->subframes + subframeIndex; 05240 if (!drflac__read_subframe_header(bs, pSubframe)) { 05241 return DRFLAC_FALSE; 05242 } 05243 05244 /* Side channels require an extra bit per sample. Took a while to figure that one out... */ 05245 subframeBitsPerSample = frame->header.bitsPerSample; 05246 if ((frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE || frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE) && subframeIndex == 1) { 05247 subframeBitsPerSample += 1; 05248 } else if (frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE && subframeIndex == 0) { 05249 subframeBitsPerSample += 1; 05250 } 05251 05252 /* Need to handle wasted bits per sample. */ 05253 if (pSubframe->wastedBitsPerSample >= subframeBitsPerSample) { 05254 return DRFLAC_FALSE; 05255 } 05256 subframeBitsPerSample -= pSubframe->wastedBitsPerSample; 05257 05258 pSubframe->pSamplesS32 = pDecodedSamplesOut; 05259 05260 switch (pSubframe->subframeType) 05261 { 05262 case DRFLAC_SUBFRAME_CONSTANT: 05263 { 05264 drflac__decode_samples__constant(bs, frame->header.blockSizeInPCMFrames, subframeBitsPerSample, pSubframe->pSamplesS32); 05265 } break; 05266 05267 case DRFLAC_SUBFRAME_VERBATIM: 05268 { 05269 drflac__decode_samples__verbatim(bs, frame->header.blockSizeInPCMFrames, subframeBitsPerSample, pSubframe->pSamplesS32); 05270 } break; 05271 05272 case DRFLAC_SUBFRAME_FIXED: 05273 { 05274 drflac__decode_samples__fixed(bs, frame->header.blockSizeInPCMFrames, subframeBitsPerSample, pSubframe->lpcOrder, pSubframe->pSamplesS32); 05275 } break; 05276 05277 case DRFLAC_SUBFRAME_LPC: 05278 { 05279 drflac__decode_samples__lpc(bs, frame->header.blockSizeInPCMFrames, subframeBitsPerSample, pSubframe->lpcOrder, pSubframe->pSamplesS32); 05280 } break; 05281 05282 default: return DRFLAC_FALSE; 05283 } 05284 05285 return DRFLAC_TRUE; 05286 } 05287 05288 static drflac_bool32 drflac__seek_subframe(drflac_bs* bs, drflac_frame* frame, int subframeIndex) 05289 { 05290 drflac_subframe* pSubframe; 05291 drflac_uint32 subframeBitsPerSample; 05292 05293 DRFLAC_ASSERT(bs != NULL); 05294 DRFLAC_ASSERT(frame != NULL); 05295 05296 pSubframe = frame->subframes + subframeIndex; 05297 if (!drflac__read_subframe_header(bs, pSubframe)) { 05298 return DRFLAC_FALSE; 05299 } 05300 05301 /* Side channels require an extra bit per sample. Took a while to figure that one out... */ 05302 subframeBitsPerSample = frame->header.bitsPerSample; 05303 if ((frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE || frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE) && subframeIndex == 1) { 05304 subframeBitsPerSample += 1; 05305 } else if (frame->header.channelAssignment == DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE && subframeIndex == 0) { 05306 subframeBitsPerSample += 1; 05307 } 05308 05309 /* Need to handle wasted bits per sample. */ 05310 if (pSubframe->wastedBitsPerSample >= subframeBitsPerSample) { 05311 return DRFLAC_FALSE; 05312 } 05313 subframeBitsPerSample -= pSubframe->wastedBitsPerSample; 05314 05315 pSubframe->pSamplesS32 = NULL; 05316 05317 switch (pSubframe->subframeType) 05318 { 05319 case DRFLAC_SUBFRAME_CONSTANT: 05320 { 05321 if (!drflac__seek_bits(bs, subframeBitsPerSample)) { 05322 return DRFLAC_FALSE; 05323 } 05324 } break; 05325 05326 case DRFLAC_SUBFRAME_VERBATIM: 05327 { 05328 unsigned int bitsToSeek = frame->header.blockSizeInPCMFrames * subframeBitsPerSample; 05329 if (!drflac__seek_bits(bs, bitsToSeek)) { 05330 return DRFLAC_FALSE; 05331 } 05332 } break; 05333 05334 case DRFLAC_SUBFRAME_FIXED: 05335 { 05336 unsigned int bitsToSeek = pSubframe->lpcOrder * subframeBitsPerSample; 05337 if (!drflac__seek_bits(bs, bitsToSeek)) { 05338 return DRFLAC_FALSE; 05339 } 05340 05341 if (!drflac__read_and_seek_residual(bs, frame->header.blockSizeInPCMFrames, pSubframe->lpcOrder)) { 05342 return DRFLAC_FALSE; 05343 } 05344 } break; 05345 05346 case DRFLAC_SUBFRAME_LPC: 05347 { 05348 drflac_uint8 lpcPrecision; 05349 05350 unsigned int bitsToSeek = pSubframe->lpcOrder * subframeBitsPerSample; 05351 if (!drflac__seek_bits(bs, bitsToSeek)) { 05352 return DRFLAC_FALSE; 05353 } 05354 05355 if (!drflac__read_uint8(bs, 4, &lpcPrecision)) { 05356 return DRFLAC_FALSE; 05357 } 05358 if (lpcPrecision == 15) { 05359 return DRFLAC_FALSE; /* Invalid. */ 05360 } 05361 lpcPrecision += 1; 05362 05363 05364 bitsToSeek = (pSubframe->lpcOrder * lpcPrecision) + 5; /* +5 for shift. */ 05365 if (!drflac__seek_bits(bs, bitsToSeek)) { 05366 return DRFLAC_FALSE; 05367 } 05368 05369 if (!drflac__read_and_seek_residual(bs, frame->header.blockSizeInPCMFrames, pSubframe->lpcOrder)) { 05370 return DRFLAC_FALSE; 05371 } 05372 } break; 05373 05374 default: return DRFLAC_FALSE; 05375 } 05376 05377 return DRFLAC_TRUE; 05378 } 05379 05380 05381 static DRFLAC_INLINE drflac_uint8 drflac__get_channel_count_from_channel_assignment(drflac_int8 channelAssignment) 05382 { 05383 drflac_uint8 lookup[] = {1, 2, 3, 4, 5, 6, 7, 8, 2, 2, 2}; 05384 05385 DRFLAC_ASSERT(channelAssignment <= 10); 05386 return lookup[channelAssignment]; 05387 } 05388 05389 static drflac_result drflac__decode_flac_frame(drflac* pFlac) 05390 { 05391 int channelCount; 05392 int i; 05393 drflac_uint8 paddingSizeInBits; 05394 drflac_uint16 desiredCRC16; 05395 #ifndef DR_FLAC_NO_CRC 05396 drflac_uint16 actualCRC16; 05397 #endif 05398 05399 /* This function should be called while the stream is sitting on the first byte after the frame header. */ 05400 DRFLAC_ZERO_MEMORY(pFlac->currentFLACFrame.subframes, sizeof(pFlac->currentFLACFrame.subframes)); 05401 05402 /* The frame block size must never be larger than the maximum block size defined by the FLAC stream. */ 05403 if (pFlac->currentFLACFrame.header.blockSizeInPCMFrames > pFlac->maxBlockSizeInPCMFrames) { 05404 return DRFLAC_ERROR; 05405 } 05406 05407 /* The number of channels in the frame must match the channel count from the STREAMINFO block. */ 05408 channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment); 05409 if (channelCount != (int)pFlac->channels) { 05410 return DRFLAC_ERROR; 05411 } 05412 05413 for (i = 0; i < channelCount; ++i) { 05414 if (!drflac__decode_subframe(&pFlac->bs, &pFlac->currentFLACFrame, i, pFlac->pDecodedSamples + (pFlac->currentFLACFrame.header.blockSizeInPCMFrames * i))) { 05415 return DRFLAC_ERROR; 05416 } 05417 } 05418 05419 paddingSizeInBits = (drflac_uint8)(DRFLAC_CACHE_L1_BITS_REMAINING(&pFlac->bs) & 7); 05420 if (paddingSizeInBits > 0) { 05421 drflac_uint8 padding = 0; 05422 if (!drflac__read_uint8(&pFlac->bs, paddingSizeInBits, &padding)) { 05423 return DRFLAC_AT_END; 05424 } 05425 } 05426 05427 #ifndef DR_FLAC_NO_CRC 05428 actualCRC16 = drflac__flush_crc16(&pFlac->bs); 05429 #endif 05430 if (!drflac__read_uint16(&pFlac->bs, 16, &desiredCRC16)) { 05431 return DRFLAC_AT_END; 05432 } 05433 05434 #ifndef DR_FLAC_NO_CRC 05435 if (actualCRC16 != desiredCRC16) { 05436 return DRFLAC_CRC_MISMATCH; /* CRC mismatch. */ 05437 } 05438 #endif 05439 05440 pFlac->currentFLACFrame.pcmFramesRemaining = pFlac->currentFLACFrame.header.blockSizeInPCMFrames; 05441 05442 return DRFLAC_SUCCESS; 05443 } 05444 05445 static drflac_result drflac__seek_flac_frame(drflac* pFlac) 05446 { 05447 int channelCount; 05448 int i; 05449 drflac_uint16 desiredCRC16; 05450 #ifndef DR_FLAC_NO_CRC 05451 drflac_uint16 actualCRC16; 05452 #endif 05453 05454 channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment); 05455 for (i = 0; i < channelCount; ++i) { 05456 if (!drflac__seek_subframe(&pFlac->bs, &pFlac->currentFLACFrame, i)) { 05457 return DRFLAC_ERROR; 05458 } 05459 } 05460 05461 /* Padding. */ 05462 if (!drflac__seek_bits(&pFlac->bs, DRFLAC_CACHE_L1_BITS_REMAINING(&pFlac->bs) & 7)) { 05463 return DRFLAC_ERROR; 05464 } 05465 05466 /* CRC. */ 05467 #ifndef DR_FLAC_NO_CRC 05468 actualCRC16 = drflac__flush_crc16(&pFlac->bs); 05469 #endif 05470 if (!drflac__read_uint16(&pFlac->bs, 16, &desiredCRC16)) { 05471 return DRFLAC_AT_END; 05472 } 05473 05474 #ifndef DR_FLAC_NO_CRC 05475 if (actualCRC16 != desiredCRC16) { 05476 return DRFLAC_CRC_MISMATCH; /* CRC mismatch. */ 05477 } 05478 #endif 05479 05480 return DRFLAC_SUCCESS; 05481 } 05482 05483 static drflac_bool32 drflac__read_and_decode_next_flac_frame(drflac* pFlac) 05484 { 05485 DRFLAC_ASSERT(pFlac != NULL); 05486 05487 for (;;) { 05488 drflac_result result; 05489 05490 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 05491 return DRFLAC_FALSE; 05492 } 05493 05494 result = drflac__decode_flac_frame(pFlac); 05495 if (result != DRFLAC_SUCCESS) { 05496 if (result == DRFLAC_CRC_MISMATCH) { 05497 continue; /* CRC mismatch. Skip to the next frame. */ 05498 } else { 05499 return DRFLAC_FALSE; 05500 } 05501 } 05502 05503 return DRFLAC_TRUE; 05504 } 05505 } 05506 05507 static void drflac__get_pcm_frame_range_of_current_flac_frame(drflac* pFlac, drflac_uint64* pFirstPCMFrame, drflac_uint64* pLastPCMFrame) 05508 { 05509 drflac_uint64 firstPCMFrame; 05510 drflac_uint64 lastPCMFrame; 05511 05512 DRFLAC_ASSERT(pFlac != NULL); 05513 05514 firstPCMFrame = pFlac->currentFLACFrame.header.pcmFrameNumber; 05515 if (firstPCMFrame == 0) { 05516 firstPCMFrame = ((drflac_uint64)pFlac->currentFLACFrame.header.flacFrameNumber) * pFlac->maxBlockSizeInPCMFrames; 05517 } 05518 05519 lastPCMFrame = firstPCMFrame + pFlac->currentFLACFrame.header.blockSizeInPCMFrames; 05520 if (lastPCMFrame > 0) { 05521 lastPCMFrame -= 1; /* Needs to be zero based. */ 05522 } 05523 05524 if (pFirstPCMFrame) { 05525 *pFirstPCMFrame = firstPCMFrame; 05526 } 05527 if (pLastPCMFrame) { 05528 *pLastPCMFrame = lastPCMFrame; 05529 } 05530 } 05531 05532 static drflac_bool32 drflac__seek_to_first_frame(drflac* pFlac) 05533 { 05534 drflac_bool32 result; 05535 05536 DRFLAC_ASSERT(pFlac != NULL); 05537 05538 result = drflac__seek_to_byte(&pFlac->bs, pFlac->firstFLACFramePosInBytes); 05539 05540 DRFLAC_ZERO_MEMORY(&pFlac->currentFLACFrame, sizeof(pFlac->currentFLACFrame)); 05541 pFlac->currentPCMFrame = 0; 05542 05543 return result; 05544 } 05545 05546 static DRFLAC_INLINE drflac_result drflac__seek_to_next_flac_frame(drflac* pFlac) 05547 { 05548 /* This function should only ever be called while the decoder is sitting on the first byte past the FRAME_HEADER section. */ 05549 DRFLAC_ASSERT(pFlac != NULL); 05550 return drflac__seek_flac_frame(pFlac); 05551 } 05552 05553 05554 static drflac_uint64 drflac__seek_forward_by_pcm_frames(drflac* pFlac, drflac_uint64 pcmFramesToSeek) 05555 { 05556 drflac_uint64 pcmFramesRead = 0; 05557 while (pcmFramesToSeek > 0) { 05558 if (pFlac->currentFLACFrame.pcmFramesRemaining == 0) { 05559 if (!drflac__read_and_decode_next_flac_frame(pFlac)) { 05560 break; /* Couldn't read the next frame, so just break from the loop and return. */ 05561 } 05562 } else { 05563 if (pFlac->currentFLACFrame.pcmFramesRemaining > pcmFramesToSeek) { 05564 pcmFramesRead += pcmFramesToSeek; 05565 pFlac->currentFLACFrame.pcmFramesRemaining -= (drflac_uint32)pcmFramesToSeek; /* <-- Safe cast. Will always be < currentFrame.pcmFramesRemaining < 65536. */ 05566 pcmFramesToSeek = 0; 05567 } else { 05568 pcmFramesRead += pFlac->currentFLACFrame.pcmFramesRemaining; 05569 pcmFramesToSeek -= pFlac->currentFLACFrame.pcmFramesRemaining; 05570 pFlac->currentFLACFrame.pcmFramesRemaining = 0; 05571 } 05572 } 05573 } 05574 05575 pFlac->currentPCMFrame += pcmFramesRead; 05576 return pcmFramesRead; 05577 } 05578 05579 05580 static drflac_bool32 drflac__seek_to_pcm_frame__brute_force(drflac* pFlac, drflac_uint64 pcmFrameIndex) 05581 { 05582 drflac_bool32 isMidFrame = DRFLAC_FALSE; 05583 drflac_uint64 runningPCMFrameCount; 05584 05585 DRFLAC_ASSERT(pFlac != NULL); 05586 05587 /* If we are seeking forward we start from the current position. Otherwise we need to start all the way from the start of the file. */ 05588 if (pcmFrameIndex >= pFlac->currentPCMFrame) { 05589 /* Seeking forward. Need to seek from the current position. */ 05590 runningPCMFrameCount = pFlac->currentPCMFrame; 05591 05592 /* The frame header for the first frame may not yet have been read. We need to do that if necessary. */ 05593 if (pFlac->currentPCMFrame == 0 && pFlac->currentFLACFrame.pcmFramesRemaining == 0) { 05594 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 05595 return DRFLAC_FALSE; 05596 } 05597 } else { 05598 isMidFrame = DRFLAC_TRUE; 05599 } 05600 } else { 05601 /* Seeking backwards. Need to seek from the start of the file. */ 05602 runningPCMFrameCount = 0; 05603 05604 /* Move back to the start. */ 05605 if (!drflac__seek_to_first_frame(pFlac)) { 05606 return DRFLAC_FALSE; 05607 } 05608 05609 /* Decode the first frame in preparation for sample-exact seeking below. */ 05610 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 05611 return DRFLAC_FALSE; 05612 } 05613 } 05614 05615 /* 05616 We need to as quickly as possible find the frame that contains the target sample. To do this, we iterate over each frame and inspect its 05617 header. If based on the header we can determine that the frame contains the sample, we do a full decode of that frame. 05618 */ 05619 for (;;) { 05620 drflac_uint64 pcmFrameCountInThisFLACFrame; 05621 drflac_uint64 firstPCMFrameInFLACFrame = 0; 05622 drflac_uint64 lastPCMFrameInFLACFrame = 0; 05623 05624 drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &firstPCMFrameInFLACFrame, &lastPCMFrameInFLACFrame); 05625 05626 pcmFrameCountInThisFLACFrame = (lastPCMFrameInFLACFrame - firstPCMFrameInFLACFrame) + 1; 05627 if (pcmFrameIndex < (runningPCMFrameCount + pcmFrameCountInThisFLACFrame)) { 05628 /* 05629 The sample should be in this frame. We need to fully decode it, however if it's an invalid frame (a CRC mismatch), we need to pretend 05630 it never existed and keep iterating. 05631 */ 05632 drflac_uint64 pcmFramesToDecode = pcmFrameIndex - runningPCMFrameCount; 05633 05634 if (!isMidFrame) { 05635 drflac_result result = drflac__decode_flac_frame(pFlac); 05636 if (result == DRFLAC_SUCCESS) { 05637 /* The frame is valid. We just need to skip over some samples to ensure it's sample-exact. */ 05638 return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode; /* <-- If this fails, something bad has happened (it should never fail). */ 05639 } else { 05640 if (result == DRFLAC_CRC_MISMATCH) { 05641 goto next_iteration; /* CRC mismatch. Pretend this frame never existed. */ 05642 } else { 05643 return DRFLAC_FALSE; 05644 } 05645 } 05646 } else { 05647 /* We started seeking mid-frame which means we need to skip the frame decoding part. */ 05648 return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode; 05649 } 05650 } else { 05651 /* 05652 It's not in this frame. We need to seek past the frame, but check if there was a CRC mismatch. If so, we pretend this 05653 frame never existed and leave the running sample count untouched. 05654 */ 05655 if (!isMidFrame) { 05656 drflac_result result = drflac__seek_to_next_flac_frame(pFlac); 05657 if (result == DRFLAC_SUCCESS) { 05658 runningPCMFrameCount += pcmFrameCountInThisFLACFrame; 05659 } else { 05660 if (result == DRFLAC_CRC_MISMATCH) { 05661 goto next_iteration; /* CRC mismatch. Pretend this frame never existed. */ 05662 } else { 05663 return DRFLAC_FALSE; 05664 } 05665 } 05666 } else { 05667 /* 05668 We started seeking mid-frame which means we need to seek by reading to the end of the frame instead of with 05669 drflac__seek_to_next_flac_frame() which only works if the decoder is sitting on the byte just after the frame header. 05670 */ 05671 runningPCMFrameCount += pFlac->currentFLACFrame.pcmFramesRemaining; 05672 pFlac->currentFLACFrame.pcmFramesRemaining = 0; 05673 isMidFrame = DRFLAC_FALSE; 05674 } 05675 05676 /* If we are seeking to the end of the file and we've just hit it, we're done. */ 05677 if (pcmFrameIndex == pFlac->totalPCMFrameCount && runningPCMFrameCount == pFlac->totalPCMFrameCount) { 05678 return DRFLAC_TRUE; 05679 } 05680 } 05681 05682 next_iteration: 05683 /* Grab the next frame in preparation for the next iteration. */ 05684 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 05685 return DRFLAC_FALSE; 05686 } 05687 } 05688 } 05689 05690 05691 #if !defined(DR_FLAC_NO_CRC) 05692 /* 05693 We use an average compression ratio to determine our approximate start location. FLAC files are generally about 50%-70% the size of their 05694 uncompressed counterparts so we'll use this as a basis. I'm going to split the middle and use a factor of 0.6 to determine the starting 05695 location. 05696 */ 05697 #define DRFLAC_BINARY_SEARCH_APPROX_COMPRESSION_RATIO 0.6f 05698 05699 static drflac_bool32 drflac__seek_to_approximate_flac_frame_to_byte(drflac* pFlac, drflac_uint64 targetByte, drflac_uint64 rangeLo, drflac_uint64 rangeHi, drflac_uint64* pLastSuccessfulSeekOffset) 05700 { 05701 DRFLAC_ASSERT(pFlac != NULL); 05702 DRFLAC_ASSERT(pLastSuccessfulSeekOffset != NULL); 05703 DRFLAC_ASSERT(targetByte >= rangeLo); 05704 DRFLAC_ASSERT(targetByte <= rangeHi); 05705 05706 *pLastSuccessfulSeekOffset = pFlac->firstFLACFramePosInBytes; 05707 05708 for (;;) { 05709 /* When seeking to a byte, failure probably means we've attempted to seek beyond the end of the stream. To counter this we just halve it each attempt. */ 05710 if (!drflac__seek_to_byte(&pFlac->bs, targetByte)) { 05711 /* If we couldn't even seek to the first byte in the stream we have a problem. Just abandon the whole thing. */ 05712 if (targetByte == 0) { 05713 drflac__seek_to_first_frame(pFlac); /* Try to recover. */ 05714 return DRFLAC_FALSE; 05715 } 05716 05717 /* Halve the byte location and continue. */ 05718 targetByte = rangeLo + ((rangeHi - rangeLo)/2); 05719 rangeHi = targetByte; 05720 } else { 05721 /* Getting here should mean that we have seeked to an appropriate byte. */ 05722 05723 /* Clear the details of the FLAC frame so we don't misreport data. */ 05724 DRFLAC_ZERO_MEMORY(&pFlac->currentFLACFrame, sizeof(pFlac->currentFLACFrame)); 05725 05726 /* 05727 Now seek to the next FLAC frame. We need to decode the entire frame (not just the header) because it's possible for the header to incorrectly pass the 05728 CRC check and return bad data. We need to decode the entire frame to be more certain. Although this seems unlikely, this has happened to me in testing 05729 so it needs to stay this way for now. 05730 */ 05731 #if 1 05732 if (!drflac__read_and_decode_next_flac_frame(pFlac)) { 05733 /* Halve the byte location and continue. */ 05734 targetByte = rangeLo + ((rangeHi - rangeLo)/2); 05735 rangeHi = targetByte; 05736 } else { 05737 break; 05738 } 05739 #else 05740 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 05741 /* Halve the byte location and continue. */ 05742 targetByte = rangeLo + ((rangeHi - rangeLo)/2); 05743 rangeHi = targetByte; 05744 } else { 05745 break; 05746 } 05747 #endif 05748 } 05749 } 05750 05751 /* The current PCM frame needs to be updated based on the frame we just seeked to. */ 05752 drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &pFlac->currentPCMFrame, NULL); 05753 05754 DRFLAC_ASSERT(targetByte <= rangeHi); 05755 05756 *pLastSuccessfulSeekOffset = targetByte; 05757 return DRFLAC_TRUE; 05758 } 05759 05760 static drflac_bool32 drflac__decode_flac_frame_and_seek_forward_by_pcm_frames(drflac* pFlac, drflac_uint64 offset) 05761 { 05762 /* This section of code would be used if we were only decoding the FLAC frame header when calling drflac__seek_to_approximate_flac_frame_to_byte(). */ 05763 #if 0 05764 if (drflac__decode_flac_frame(pFlac) != DRFLAC_SUCCESS) { 05765 /* We failed to decode this frame which may be due to it being corrupt. We'll just use the next valid FLAC frame. */ 05766 if (drflac__read_and_decode_next_flac_frame(pFlac) == DRFLAC_FALSE) { 05767 return DRFLAC_FALSE; 05768 } 05769 } 05770 #endif 05771 05772 return drflac__seek_forward_by_pcm_frames(pFlac, offset) == offset; 05773 } 05774 05775 05776 static drflac_bool32 drflac__seek_to_pcm_frame__binary_search_internal(drflac* pFlac, drflac_uint64 pcmFrameIndex, drflac_uint64 byteRangeLo, drflac_uint64 byteRangeHi) 05777 { 05778 /* This assumes pFlac->currentPCMFrame is sitting on byteRangeLo upon entry. */ 05779 05780 drflac_uint64 targetByte; 05781 drflac_uint64 pcmRangeLo = pFlac->totalPCMFrameCount; 05782 drflac_uint64 pcmRangeHi = 0; 05783 drflac_uint64 lastSuccessfulSeekOffset = (drflac_uint64)-1; 05784 drflac_uint64 closestSeekOffsetBeforeTargetPCMFrame = byteRangeLo; 05785 drflac_uint32 seekForwardThreshold = (pFlac->maxBlockSizeInPCMFrames != 0) ? pFlac->maxBlockSizeInPCMFrames*2 : 4096; 05786 05787 targetByte = byteRangeLo + (drflac_uint64)(((drflac_int64)((pcmFrameIndex - pFlac->currentPCMFrame) * pFlac->channels * pFlac->bitsPerSample)/8.0f) * DRFLAC_BINARY_SEARCH_APPROX_COMPRESSION_RATIO); 05788 if (targetByte > byteRangeHi) { 05789 targetByte = byteRangeHi; 05790 } 05791 05792 for (;;) { 05793 if (drflac__seek_to_approximate_flac_frame_to_byte(pFlac, targetByte, byteRangeLo, byteRangeHi, &lastSuccessfulSeekOffset)) { 05794 /* We found a FLAC frame. We need to check if it contains the sample we're looking for. */ 05795 drflac_uint64 newPCMRangeLo; 05796 drflac_uint64 newPCMRangeHi; 05797 drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &newPCMRangeLo, &newPCMRangeHi); 05798 05799 /* If we selected the same frame, it means we should be pretty close. Just decode the rest. */ 05800 if (pcmRangeLo == newPCMRangeLo) { 05801 if (!drflac__seek_to_approximate_flac_frame_to_byte(pFlac, closestSeekOffsetBeforeTargetPCMFrame, closestSeekOffsetBeforeTargetPCMFrame, byteRangeHi, &lastSuccessfulSeekOffset)) { 05802 break; /* Failed to seek to closest frame. */ 05803 } 05804 05805 if (drflac__decode_flac_frame_and_seek_forward_by_pcm_frames(pFlac, pcmFrameIndex - pFlac->currentPCMFrame)) { 05806 return DRFLAC_TRUE; 05807 } else { 05808 break; /* Failed to seek forward. */ 05809 } 05810 } 05811 05812 pcmRangeLo = newPCMRangeLo; 05813 pcmRangeHi = newPCMRangeHi; 05814 05815 if (pcmRangeLo <= pcmFrameIndex && pcmRangeHi >= pcmFrameIndex) { 05816 /* The target PCM frame is in this FLAC frame. */ 05817 if (drflac__decode_flac_frame_and_seek_forward_by_pcm_frames(pFlac, pcmFrameIndex - pFlac->currentPCMFrame) ) { 05818 return DRFLAC_TRUE; 05819 } else { 05820 break; /* Failed to seek to FLAC frame. */ 05821 } 05822 } else { 05823 const float approxCompressionRatio = (drflac_int64)(lastSuccessfulSeekOffset - pFlac->firstFLACFramePosInBytes) / ((drflac_int64)(pcmRangeLo * pFlac->channels * pFlac->bitsPerSample)/8.0f); 05824 05825 if (pcmRangeLo > pcmFrameIndex) { 05826 /* We seeked too far forward. We need to move our target byte backward and try again. */ 05827 byteRangeHi = lastSuccessfulSeekOffset; 05828 if (byteRangeLo > byteRangeHi) { 05829 byteRangeLo = byteRangeHi; 05830 } 05831 05832 targetByte = byteRangeLo + ((byteRangeHi - byteRangeLo) / 2); 05833 if (targetByte < byteRangeLo) { 05834 targetByte = byteRangeLo; 05835 } 05836 } else /*if (pcmRangeHi < pcmFrameIndex)*/ { 05837 /* We didn't seek far enough. We need to move our target byte forward and try again. */ 05838 05839 /* If we're close enough we can just seek forward. */ 05840 if ((pcmFrameIndex - pcmRangeLo) < seekForwardThreshold) { 05841 if (drflac__decode_flac_frame_and_seek_forward_by_pcm_frames(pFlac, pcmFrameIndex - pFlac->currentPCMFrame)) { 05842 return DRFLAC_TRUE; 05843 } else { 05844 break; /* Failed to seek to FLAC frame. */ 05845 } 05846 } else { 05847 byteRangeLo = lastSuccessfulSeekOffset; 05848 if (byteRangeHi < byteRangeLo) { 05849 byteRangeHi = byteRangeLo; 05850 } 05851 05852 targetByte = lastSuccessfulSeekOffset + (drflac_uint64)(((drflac_int64)((pcmFrameIndex-pcmRangeLo) * pFlac->channels * pFlac->bitsPerSample)/8.0f) * approxCompressionRatio); 05853 if (targetByte > byteRangeHi) { 05854 targetByte = byteRangeHi; 05855 } 05856 05857 if (closestSeekOffsetBeforeTargetPCMFrame < lastSuccessfulSeekOffset) { 05858 closestSeekOffsetBeforeTargetPCMFrame = lastSuccessfulSeekOffset; 05859 } 05860 } 05861 } 05862 } 05863 } else { 05864 /* Getting here is really bad. We just recover as best we can, but moving to the first frame in the stream, and then abort. */ 05865 break; 05866 } 05867 } 05868 05869 drflac__seek_to_first_frame(pFlac); /* <-- Try to recover. */ 05870 return DRFLAC_FALSE; 05871 } 05872 05873 static drflac_bool32 drflac__seek_to_pcm_frame__binary_search(drflac* pFlac, drflac_uint64 pcmFrameIndex) 05874 { 05875 drflac_uint64 byteRangeLo; 05876 drflac_uint64 byteRangeHi; 05877 drflac_uint32 seekForwardThreshold = (pFlac->maxBlockSizeInPCMFrames != 0) ? pFlac->maxBlockSizeInPCMFrames*2 : 4096; 05878 05879 /* Our algorithm currently assumes the FLAC stream is currently sitting at the start. */ 05880 if (drflac__seek_to_first_frame(pFlac) == DRFLAC_FALSE) { 05881 return DRFLAC_FALSE; 05882 } 05883 05884 /* If we're close enough to the start, just move to the start and seek forward. */ 05885 if (pcmFrameIndex < seekForwardThreshold) { 05886 return drflac__seek_forward_by_pcm_frames(pFlac, pcmFrameIndex) == pcmFrameIndex; 05887 } 05888 05889 /* 05890 Our starting byte range is the byte position of the first FLAC frame and the approximate end of the file as if it were completely uncompressed. This ensures 05891 the entire file is included, even though most of the time it'll exceed the end of the actual stream. This is OK as the frame searching logic will handle it. 05892 */ 05893 byteRangeLo = pFlac->firstFLACFramePosInBytes; 05894 byteRangeHi = pFlac->firstFLACFramePosInBytes + (drflac_uint64)((drflac_int64)(pFlac->totalPCMFrameCount * pFlac->channels * pFlac->bitsPerSample)/8.0f); 05895 05896 return drflac__seek_to_pcm_frame__binary_search_internal(pFlac, pcmFrameIndex, byteRangeLo, byteRangeHi); 05897 } 05898 #endif /* !DR_FLAC_NO_CRC */ 05899 05900 static drflac_bool32 drflac__seek_to_pcm_frame__seek_table(drflac* pFlac, drflac_uint64 pcmFrameIndex) 05901 { 05902 drflac_uint32 iClosestSeekpoint = 0; 05903 drflac_bool32 isMidFrame = DRFLAC_FALSE; 05904 drflac_uint64 runningPCMFrameCount; 05905 drflac_uint32 iSeekpoint; 05906 05907 05908 DRFLAC_ASSERT(pFlac != NULL); 05909 05910 if (pFlac->pSeekpoints == NULL || pFlac->seekpointCount == 0) { 05911 return DRFLAC_FALSE; 05912 } 05913 05914 for (iSeekpoint = 0; iSeekpoint < pFlac->seekpointCount; ++iSeekpoint) { 05915 if (pFlac->pSeekpoints[iSeekpoint].firstPCMFrame >= pcmFrameIndex) { 05916 break; 05917 } 05918 05919 iClosestSeekpoint = iSeekpoint; 05920 } 05921 05922 /* There's been cases where the seek table contains only zeros. We need to do some basic validation on the closest seekpoint. */ 05923 if (pFlac->pSeekpoints[iClosestSeekpoint].pcmFrameCount == 0 || pFlac->pSeekpoints[iClosestSeekpoint].pcmFrameCount > pFlac->maxBlockSizeInPCMFrames) { 05924 return DRFLAC_FALSE; 05925 } 05926 if (pFlac->pSeekpoints[iClosestSeekpoint].firstPCMFrame > pFlac->totalPCMFrameCount && pFlac->totalPCMFrameCount > 0) { 05927 return DRFLAC_FALSE; 05928 } 05929 05930 #if !defined(DR_FLAC_NO_CRC) 05931 /* At this point we should know the closest seek point. We can use a binary search for this. We need to know the total sample count for this. */ 05932 if (pFlac->totalPCMFrameCount > 0) { 05933 drflac_uint64 byteRangeLo; 05934 drflac_uint64 byteRangeHi; 05935 05936 byteRangeHi = pFlac->firstFLACFramePosInBytes + (drflac_uint64)((drflac_int64)(pFlac->totalPCMFrameCount * pFlac->channels * pFlac->bitsPerSample)/8.0f); 05937 byteRangeLo = pFlac->firstFLACFramePosInBytes + pFlac->pSeekpoints[iClosestSeekpoint].flacFrameOffset; 05938 05939 /* 05940 If our closest seek point is not the last one, we only need to search between it and the next one. The section below calculates an appropriate starting 05941 value for byteRangeHi which will clamp it appropriately. 05942 05943 Note that the next seekpoint must have an offset greater than the closest seekpoint because otherwise our binary search algorithm will break down. There 05944 have been cases where a seektable consists of seek points where every byte offset is set to 0 which causes problems. If this happens we need to abort. 05945 */ 05946 if (iClosestSeekpoint < pFlac->seekpointCount-1) { 05947 drflac_uint32 iNextSeekpoint = iClosestSeekpoint + 1; 05948 05949 /* Basic validation on the seekpoints to ensure they're usable. */ 05950 if (pFlac->pSeekpoints[iClosestSeekpoint].flacFrameOffset >= pFlac->pSeekpoints[iNextSeekpoint].flacFrameOffset || pFlac->pSeekpoints[iNextSeekpoint].pcmFrameCount == 0) { 05951 return DRFLAC_FALSE; /* The next seekpoint doesn't look right. The seek table cannot be trusted from here. Abort. */ 05952 } 05953 05954 if (pFlac->pSeekpoints[iNextSeekpoint].firstPCMFrame != (((drflac_uint64)0xFFFFFFFF << 32) | 0xFFFFFFFF)) { /* Make sure it's not a placeholder seekpoint. */ 05955 byteRangeHi = pFlac->firstFLACFramePosInBytes + pFlac->pSeekpoints[iNextSeekpoint].flacFrameOffset - 1; /* byteRangeHi must be zero based. */ 05956 } 05957 } 05958 05959 if (drflac__seek_to_byte(&pFlac->bs, pFlac->firstFLACFramePosInBytes + pFlac->pSeekpoints[iClosestSeekpoint].flacFrameOffset)) { 05960 if (drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 05961 drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &pFlac->currentPCMFrame, NULL); 05962 05963 if (drflac__seek_to_pcm_frame__binary_search_internal(pFlac, pcmFrameIndex, byteRangeLo, byteRangeHi)) { 05964 return DRFLAC_TRUE; 05965 } 05966 } 05967 } 05968 } 05969 #endif /* !DR_FLAC_NO_CRC */ 05970 05971 /* Getting here means we need to use a slower algorithm because the binary search method failed or cannot be used. */ 05972 05973 /* 05974 If we are seeking forward and the closest seekpoint is _before_ the current sample, we just seek forward from where we are. Otherwise we start seeking 05975 from the seekpoint's first sample. 05976 */ 05977 if (pcmFrameIndex >= pFlac->currentPCMFrame && pFlac->pSeekpoints[iClosestSeekpoint].firstPCMFrame <= pFlac->currentPCMFrame) { 05978 /* Optimized case. Just seek forward from where we are. */ 05979 runningPCMFrameCount = pFlac->currentPCMFrame; 05980 05981 /* The frame header for the first frame may not yet have been read. We need to do that if necessary. */ 05982 if (pFlac->currentPCMFrame == 0 && pFlac->currentFLACFrame.pcmFramesRemaining == 0) { 05983 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 05984 return DRFLAC_FALSE; 05985 } 05986 } else { 05987 isMidFrame = DRFLAC_TRUE; 05988 } 05989 } else { 05990 /* Slower case. Seek to the start of the seekpoint and then seek forward from there. */ 05991 runningPCMFrameCount = pFlac->pSeekpoints[iClosestSeekpoint].firstPCMFrame; 05992 05993 if (!drflac__seek_to_byte(&pFlac->bs, pFlac->firstFLACFramePosInBytes + pFlac->pSeekpoints[iClosestSeekpoint].flacFrameOffset)) { 05994 return DRFLAC_FALSE; 05995 } 05996 05997 /* Grab the frame the seekpoint is sitting on in preparation for the sample-exact seeking below. */ 05998 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 05999 return DRFLAC_FALSE; 06000 } 06001 } 06002 06003 for (;;) { 06004 drflac_uint64 pcmFrameCountInThisFLACFrame; 06005 drflac_uint64 firstPCMFrameInFLACFrame = 0; 06006 drflac_uint64 lastPCMFrameInFLACFrame = 0; 06007 06008 drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &firstPCMFrameInFLACFrame, &lastPCMFrameInFLACFrame); 06009 06010 pcmFrameCountInThisFLACFrame = (lastPCMFrameInFLACFrame - firstPCMFrameInFLACFrame) + 1; 06011 if (pcmFrameIndex < (runningPCMFrameCount + pcmFrameCountInThisFLACFrame)) { 06012 /* 06013 The sample should be in this frame. We need to fully decode it, but if it's an invalid frame (a CRC mismatch) we need to pretend 06014 it never existed and keep iterating. 06015 */ 06016 drflac_uint64 pcmFramesToDecode = pcmFrameIndex - runningPCMFrameCount; 06017 06018 if (!isMidFrame) { 06019 drflac_result result = drflac__decode_flac_frame(pFlac); 06020 if (result == DRFLAC_SUCCESS) { 06021 /* The frame is valid. We just need to skip over some samples to ensure it's sample-exact. */ 06022 return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode; /* <-- If this fails, something bad has happened (it should never fail). */ 06023 } else { 06024 if (result == DRFLAC_CRC_MISMATCH) { 06025 goto next_iteration; /* CRC mismatch. Pretend this frame never existed. */ 06026 } else { 06027 return DRFLAC_FALSE; 06028 } 06029 } 06030 } else { 06031 /* We started seeking mid-frame which means we need to skip the frame decoding part. */ 06032 return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode; 06033 } 06034 } else { 06035 /* 06036 It's not in this frame. We need to seek past the frame, but check if there was a CRC mismatch. If so, we pretend this 06037 frame never existed and leave the running sample count untouched. 06038 */ 06039 if (!isMidFrame) { 06040 drflac_result result = drflac__seek_to_next_flac_frame(pFlac); 06041 if (result == DRFLAC_SUCCESS) { 06042 runningPCMFrameCount += pcmFrameCountInThisFLACFrame; 06043 } else { 06044 if (result == DRFLAC_CRC_MISMATCH) { 06045 goto next_iteration; /* CRC mismatch. Pretend this frame never existed. */ 06046 } else { 06047 return DRFLAC_FALSE; 06048 } 06049 } 06050 } else { 06051 /* 06052 We started seeking mid-frame which means we need to seek by reading to the end of the frame instead of with 06053 drflac__seek_to_next_flac_frame() which only works if the decoder is sitting on the byte just after the frame header. 06054 */ 06055 runningPCMFrameCount += pFlac->currentFLACFrame.pcmFramesRemaining; 06056 pFlac->currentFLACFrame.pcmFramesRemaining = 0; 06057 isMidFrame = DRFLAC_FALSE; 06058 } 06059 06060 /* If we are seeking to the end of the file and we've just hit it, we're done. */ 06061 if (pcmFrameIndex == pFlac->totalPCMFrameCount && runningPCMFrameCount == pFlac->totalPCMFrameCount) { 06062 return DRFLAC_TRUE; 06063 } 06064 } 06065 06066 next_iteration: 06067 /* Grab the next frame in preparation for the next iteration. */ 06068 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 06069 return DRFLAC_FALSE; 06070 } 06071 } 06072 } 06073 06074 06075 #ifndef DR_FLAC_NO_OGG 06076 typedef struct 06077 { 06078 drflac_uint8 capturePattern[4]; /* Should be "OggS" */ 06079 drflac_uint8 structureVersion; /* Always 0. */ 06080 drflac_uint8 headerType; 06081 drflac_uint64 granulePosition; 06082 drflac_uint32 serialNumber; 06083 drflac_uint32 sequenceNumber; 06084 drflac_uint32 checksum; 06085 drflac_uint8 segmentCount; 06086 drflac_uint8 segmentTable[255]; 06087 } drflac_ogg_page_header; 06088 #endif 06089 06090 typedef struct 06091 { 06092 drflac_read_proc onRead; 06093 drflac_seek_proc onSeek; 06094 drflac_meta_proc onMeta; 06095 drflac_container container; 06096 void* pUserData; 06097 void* pUserDataMD; 06098 drflac_uint32 sampleRate; 06099 drflac_uint8 channels; 06100 drflac_uint8 bitsPerSample; 06101 drflac_uint64 totalPCMFrameCount; 06102 drflac_uint16 maxBlockSizeInPCMFrames; 06103 drflac_uint64 runningFilePos; 06104 drflac_bool32 hasStreamInfoBlock; 06105 drflac_bool32 hasMetadataBlocks; 06106 drflac_bs bs; /* <-- A bit streamer is required for loading data during initialization. */ 06107 drflac_frame_header firstFrameHeader; /* <-- The header of the first frame that was read during relaxed initalization. Only set if there is no STREAMINFO block. */ 06108 06109 #ifndef DR_FLAC_NO_OGG 06110 drflac_uint32 oggSerial; 06111 drflac_uint64 oggFirstBytePos; 06112 drflac_ogg_page_header oggBosHeader; 06113 #endif 06114 } drflac_init_info; 06115 06116 static DRFLAC_INLINE void drflac__decode_block_header(drflac_uint32 blockHeader, drflac_uint8* isLastBlock, drflac_uint8* blockType, drflac_uint32* blockSize) 06117 { 06118 blockHeader = drflac__be2host_32(blockHeader); 06119 *isLastBlock = (drflac_uint8)((blockHeader & 0x80000000UL) >> 31); 06120 *blockType = (drflac_uint8)((blockHeader & 0x7F000000UL) >> 24); 06121 *blockSize = (blockHeader & 0x00FFFFFFUL); 06122 } 06123 06124 static DRFLAC_INLINE drflac_bool32 drflac__read_and_decode_block_header(drflac_read_proc onRead, void* pUserData, drflac_uint8* isLastBlock, drflac_uint8* blockType, drflac_uint32* blockSize) 06125 { 06126 drflac_uint32 blockHeader; 06127 06128 *blockSize = 0; 06129 if (onRead(pUserData, &blockHeader, 4) != 4) { 06130 return DRFLAC_FALSE; 06131 } 06132 06133 drflac__decode_block_header(blockHeader, isLastBlock, blockType, blockSize); 06134 return DRFLAC_TRUE; 06135 } 06136 06137 static drflac_bool32 drflac__read_streaminfo(drflac_read_proc onRead, void* pUserData, drflac_streaminfo* pStreamInfo) 06138 { 06139 drflac_uint32 blockSizes; 06140 drflac_uint64 frameSizes = 0; 06141 drflac_uint64 importantProps; 06142 drflac_uint8 md5[16]; 06143 06144 /* min/max block size. */ 06145 if (onRead(pUserData, &blockSizes, 4) != 4) { 06146 return DRFLAC_FALSE; 06147 } 06148 06149 /* min/max frame size. */ 06150 if (onRead(pUserData, &frameSizes, 6) != 6) { 06151 return DRFLAC_FALSE; 06152 } 06153 06154 /* Sample rate, channels, bits per sample and total sample count. */ 06155 if (onRead(pUserData, &importantProps, 8) != 8) { 06156 return DRFLAC_FALSE; 06157 } 06158 06159 /* MD5 */ 06160 if (onRead(pUserData, md5, sizeof(md5)) != sizeof(md5)) { 06161 return DRFLAC_FALSE; 06162 } 06163 06164 blockSizes = drflac__be2host_32(blockSizes); 06165 frameSizes = drflac__be2host_64(frameSizes); 06166 importantProps = drflac__be2host_64(importantProps); 06167 06168 pStreamInfo->minBlockSizeInPCMFrames = (drflac_uint16)((blockSizes & 0xFFFF0000) >> 16); 06169 pStreamInfo->maxBlockSizeInPCMFrames = (drflac_uint16) (blockSizes & 0x0000FFFF); 06170 pStreamInfo->minFrameSizeInPCMFrames = (drflac_uint32)((frameSizes & (((drflac_uint64)0x00FFFFFF << 16) << 24)) >> 40); 06171 pStreamInfo->maxFrameSizeInPCMFrames = (drflac_uint32)((frameSizes & (((drflac_uint64)0x00FFFFFF << 16) << 0)) >> 16); 06172 pStreamInfo->sampleRate = (drflac_uint32)((importantProps & (((drflac_uint64)0x000FFFFF << 16) << 28)) >> 44); 06173 pStreamInfo->channels = (drflac_uint8 )((importantProps & (((drflac_uint64)0x0000000E << 16) << 24)) >> 41) + 1; 06174 pStreamInfo->bitsPerSample = (drflac_uint8 )((importantProps & (((drflac_uint64)0x0000001F << 16) << 20)) >> 36) + 1; 06175 pStreamInfo->totalPCMFrameCount = ((importantProps & ((((drflac_uint64)0x0000000F << 16) << 16) | 0xFFFFFFFF))); 06176 DRFLAC_COPY_MEMORY(pStreamInfo->md5, md5, sizeof(md5)); 06177 06178 return DRFLAC_TRUE; 06179 } 06180 06181 06182 static void* drflac__malloc_default(size_t sz, void* pUserData) 06183 { 06184 (void)pUserData; 06185 return DRFLAC_MALLOC(sz); 06186 } 06187 06188 static void* drflac__realloc_default(void* p, size_t sz, void* pUserData) 06189 { 06190 (void)pUserData; 06191 return DRFLAC_REALLOC(p, sz); 06192 } 06193 06194 static void drflac__free_default(void* p, void* pUserData) 06195 { 06196 (void)pUserData; 06197 DRFLAC_FREE(p); 06198 } 06199 06200 06201 static void* drflac__malloc_from_callbacks(size_t sz, const drflac_allocation_callbacks* pAllocationCallbacks) 06202 { 06203 if (pAllocationCallbacks == NULL) { 06204 return NULL; 06205 } 06206 06207 if (pAllocationCallbacks->onMalloc != NULL) { 06208 return pAllocationCallbacks->onMalloc(sz, pAllocationCallbacks->pUserData); 06209 } 06210 06211 /* Try using realloc(). */ 06212 if (pAllocationCallbacks->onRealloc != NULL) { 06213 return pAllocationCallbacks->onRealloc(NULL, sz, pAllocationCallbacks->pUserData); 06214 } 06215 06216 return NULL; 06217 } 06218 06219 static void* drflac__realloc_from_callbacks(void* p, size_t szNew, size_t szOld, const drflac_allocation_callbacks* pAllocationCallbacks) 06220 { 06221 if (pAllocationCallbacks == NULL) { 06222 return NULL; 06223 } 06224 06225 if (pAllocationCallbacks->onRealloc != NULL) { 06226 return pAllocationCallbacks->onRealloc(p, szNew, pAllocationCallbacks->pUserData); 06227 } 06228 06229 /* Try emulating realloc() in terms of malloc()/free(). */ 06230 if (pAllocationCallbacks->onMalloc != NULL && pAllocationCallbacks->onFree != NULL) { 06231 void* p2; 06232 06233 p2 = pAllocationCallbacks->onMalloc(szNew, pAllocationCallbacks->pUserData); 06234 if (p2 == NULL) { 06235 return NULL; 06236 } 06237 06238 if (p != NULL) { 06239 DRFLAC_COPY_MEMORY(p2, p, szOld); 06240 pAllocationCallbacks->onFree(p, pAllocationCallbacks->pUserData); 06241 } 06242 06243 return p2; 06244 } 06245 06246 return NULL; 06247 } 06248 06249 static void drflac__free_from_callbacks(void* p, const drflac_allocation_callbacks* pAllocationCallbacks) 06250 { 06251 if (p == NULL || pAllocationCallbacks == NULL) { 06252 return; 06253 } 06254 06255 if (pAllocationCallbacks->onFree != NULL) { 06256 pAllocationCallbacks->onFree(p, pAllocationCallbacks->pUserData); 06257 } 06258 } 06259 06260 06261 static drflac_bool32 drflac__read_and_decode_metadata(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, void* pUserDataMD, drflac_uint64* pFirstFramePos, drflac_uint64* pSeektablePos, drflac_uint32* pSeektableSize, drflac_allocation_callbacks* pAllocationCallbacks) 06262 { 06263 /* 06264 We want to keep track of the byte position in the stream of the seektable. At the time of calling this function we know that 06265 we'll be sitting on byte 42. 06266 */ 06267 drflac_uint64 runningFilePos = 42; 06268 drflac_uint64 seektablePos = 0; 06269 drflac_uint32 seektableSize = 0; 06270 06271 for (;;) { 06272 drflac_metadata metadata; 06273 drflac_uint8 isLastBlock = 0; 06274 drflac_uint8 blockType; 06275 drflac_uint32 blockSize; 06276 if (drflac__read_and_decode_block_header(onRead, pUserData, &isLastBlock, &blockType, &blockSize) == DRFLAC_FALSE) { 06277 return DRFLAC_FALSE; 06278 } 06279 runningFilePos += 4; 06280 06281 metadata.type = blockType; 06282 metadata.pRawData = NULL; 06283 metadata.rawDataSize = 0; 06284 06285 switch (blockType) 06286 { 06287 case DRFLAC_METADATA_BLOCK_TYPE_APPLICATION: 06288 { 06289 if (blockSize < 4) { 06290 return DRFLAC_FALSE; 06291 } 06292 06293 if (onMeta) { 06294 void* pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks); 06295 if (pRawData == NULL) { 06296 return DRFLAC_FALSE; 06297 } 06298 06299 if (onRead(pUserData, pRawData, blockSize) != blockSize) { 06300 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06301 return DRFLAC_FALSE; 06302 } 06303 06304 metadata.pRawData = pRawData; 06305 metadata.rawDataSize = blockSize; 06306 metadata.data.application.id = drflac__be2host_32(*(drflac_uint32*)pRawData); 06307 metadata.data.application.pData = (const void*)((drflac_uint8*)pRawData + sizeof(drflac_uint32)); 06308 metadata.data.application.dataSize = blockSize - sizeof(drflac_uint32); 06309 onMeta(pUserDataMD, &metadata); 06310 06311 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06312 } 06313 } break; 06314 06315 case DRFLAC_METADATA_BLOCK_TYPE_SEEKTABLE: 06316 { 06317 seektablePos = runningFilePos; 06318 seektableSize = blockSize; 06319 06320 if (onMeta) { 06321 drflac_uint32 iSeekpoint; 06322 void* pRawData; 06323 06324 pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks); 06325 if (pRawData == NULL) { 06326 return DRFLAC_FALSE; 06327 } 06328 06329 if (onRead(pUserData, pRawData, blockSize) != blockSize) { 06330 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06331 return DRFLAC_FALSE; 06332 } 06333 06334 metadata.pRawData = pRawData; 06335 metadata.rawDataSize = blockSize; 06336 metadata.data.seektable.seekpointCount = blockSize/sizeof(drflac_seekpoint); 06337 metadata.data.seektable.pSeekpoints = (const drflac_seekpoint*)pRawData; 06338 06339 /* Endian swap. */ 06340 for (iSeekpoint = 0; iSeekpoint < metadata.data.seektable.seekpointCount; ++iSeekpoint) { 06341 drflac_seekpoint* pSeekpoint = (drflac_seekpoint*)pRawData + iSeekpoint; 06342 pSeekpoint->firstPCMFrame = drflac__be2host_64(pSeekpoint->firstPCMFrame); 06343 pSeekpoint->flacFrameOffset = drflac__be2host_64(pSeekpoint->flacFrameOffset); 06344 pSeekpoint->pcmFrameCount = drflac__be2host_16(pSeekpoint->pcmFrameCount); 06345 } 06346 06347 onMeta(pUserDataMD, &metadata); 06348 06349 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06350 } 06351 } break; 06352 06353 case DRFLAC_METADATA_BLOCK_TYPE_VORBIS_COMMENT: 06354 { 06355 if (blockSize < 8) { 06356 return DRFLAC_FALSE; 06357 } 06358 06359 if (onMeta) { 06360 void* pRawData; 06361 const char* pRunningData; 06362 const char* pRunningDataEnd; 06363 drflac_uint32 i; 06364 06365 pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks); 06366 if (pRawData == NULL) { 06367 return DRFLAC_FALSE; 06368 } 06369 06370 if (onRead(pUserData, pRawData, blockSize) != blockSize) { 06371 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06372 return DRFLAC_FALSE; 06373 } 06374 06375 metadata.pRawData = pRawData; 06376 metadata.rawDataSize = blockSize; 06377 06378 pRunningData = (const char*)pRawData; 06379 pRunningDataEnd = (const char*)pRawData + blockSize; 06380 06381 metadata.data.vorbis_comment.vendorLength = drflac__le2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 06382 06383 /* Need space for the rest of the block */ 06384 if ((pRunningDataEnd - pRunningData) - 4 < (drflac_int64)metadata.data.vorbis_comment.vendorLength) { /* <-- Note the order of operations to avoid overflow to a valid value */ 06385 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06386 return DRFLAC_FALSE; 06387 } 06388 metadata.data.vorbis_comment.vendor = pRunningData; pRunningData += metadata.data.vorbis_comment.vendorLength; 06389 metadata.data.vorbis_comment.commentCount = drflac__le2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 06390 06391 /* Need space for 'commentCount' comments after the block, which at minimum is a drflac_uint32 per comment */ 06392 if ((pRunningDataEnd - pRunningData) / sizeof(drflac_uint32) < metadata.data.vorbis_comment.commentCount) { /* <-- Note the order of operations to avoid overflow to a valid value */ 06393 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06394 return DRFLAC_FALSE; 06395 } 06396 metadata.data.vorbis_comment.pComments = pRunningData; 06397 06398 /* Check that the comments section is valid before passing it to the callback */ 06399 for (i = 0; i < metadata.data.vorbis_comment.commentCount; ++i) { 06400 drflac_uint32 commentLength; 06401 06402 if (pRunningDataEnd - pRunningData < 4) { 06403 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06404 return DRFLAC_FALSE; 06405 } 06406 06407 commentLength = drflac__le2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 06408 if (pRunningDataEnd - pRunningData < (drflac_int64)commentLength) { /* <-- Note the order of operations to avoid overflow to a valid value */ 06409 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06410 return DRFLAC_FALSE; 06411 } 06412 pRunningData += commentLength; 06413 } 06414 06415 onMeta(pUserDataMD, &metadata); 06416 06417 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06418 } 06419 } break; 06420 06421 case DRFLAC_METADATA_BLOCK_TYPE_CUESHEET: 06422 { 06423 if (blockSize < 396) { 06424 return DRFLAC_FALSE; 06425 } 06426 06427 if (onMeta) { 06428 void* pRawData; 06429 const char* pRunningData; 06430 const char* pRunningDataEnd; 06431 drflac_uint8 iTrack; 06432 drflac_uint8 iIndex; 06433 06434 pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks); 06435 if (pRawData == NULL) { 06436 return DRFLAC_FALSE; 06437 } 06438 06439 if (onRead(pUserData, pRawData, blockSize) != blockSize) { 06440 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06441 return DRFLAC_FALSE; 06442 } 06443 06444 metadata.pRawData = pRawData; 06445 metadata.rawDataSize = blockSize; 06446 06447 pRunningData = (const char*)pRawData; 06448 pRunningDataEnd = (const char*)pRawData + blockSize; 06449 06450 DRFLAC_COPY_MEMORY(metadata.data.cuesheet.catalog, pRunningData, 128); pRunningData += 128; 06451 metadata.data.cuesheet.leadInSampleCount = drflac__be2host_64(*(const drflac_uint64*)pRunningData); pRunningData += 8; 06452 metadata.data.cuesheet.isCD = (pRunningData[0] & 0x80) != 0; pRunningData += 259; 06453 metadata.data.cuesheet.trackCount = pRunningData[0]; pRunningData += 1; 06454 metadata.data.cuesheet.pTrackData = pRunningData; 06455 06456 /* Check that the cuesheet tracks are valid before passing it to the callback */ 06457 for (iTrack = 0; iTrack < metadata.data.cuesheet.trackCount; ++iTrack) { 06458 drflac_uint8 indexCount; 06459 drflac_uint32 indexPointSize; 06460 06461 if (pRunningDataEnd - pRunningData < 36) { 06462 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06463 return DRFLAC_FALSE; 06464 } 06465 06466 /* Skip to the index point count */ 06467 pRunningData += 35; 06468 indexCount = pRunningData[0]; pRunningData += 1; 06469 indexPointSize = indexCount * sizeof(drflac_cuesheet_track_index); 06470 if (pRunningDataEnd - pRunningData < (drflac_int64)indexPointSize) { 06471 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06472 return DRFLAC_FALSE; 06473 } 06474 06475 /* Endian swap. */ 06476 for (iIndex = 0; iIndex < indexCount; ++iIndex) { 06477 drflac_cuesheet_track_index* pTrack = (drflac_cuesheet_track_index*)pRunningData; 06478 pRunningData += sizeof(drflac_cuesheet_track_index); 06479 pTrack->offset = drflac__be2host_64(pTrack->offset); 06480 } 06481 } 06482 06483 onMeta(pUserDataMD, &metadata); 06484 06485 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06486 } 06487 } break; 06488 06489 case DRFLAC_METADATA_BLOCK_TYPE_PICTURE: 06490 { 06491 if (blockSize < 32) { 06492 return DRFLAC_FALSE; 06493 } 06494 06495 if (onMeta) { 06496 void* pRawData; 06497 const char* pRunningData; 06498 const char* pRunningDataEnd; 06499 06500 pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks); 06501 if (pRawData == NULL) { 06502 return DRFLAC_FALSE; 06503 } 06504 06505 if (onRead(pUserData, pRawData, blockSize) != blockSize) { 06506 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06507 return DRFLAC_FALSE; 06508 } 06509 06510 metadata.pRawData = pRawData; 06511 metadata.rawDataSize = blockSize; 06512 06513 pRunningData = (const char*)pRawData; 06514 pRunningDataEnd = (const char*)pRawData + blockSize; 06515 06516 metadata.data.picture.type = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 06517 metadata.data.picture.mimeLength = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 06518 06519 /* Need space for the rest of the block */ 06520 if ((pRunningDataEnd - pRunningData) - 24 < (drflac_int64)metadata.data.picture.mimeLength) { /* <-- Note the order of operations to avoid overflow to a valid value */ 06521 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06522 return DRFLAC_FALSE; 06523 } 06524 metadata.data.picture.mime = pRunningData; pRunningData += metadata.data.picture.mimeLength; 06525 metadata.data.picture.descriptionLength = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 06526 06527 /* Need space for the rest of the block */ 06528 if ((pRunningDataEnd - pRunningData) - 20 < (drflac_int64)metadata.data.picture.descriptionLength) { /* <-- Note the order of operations to avoid overflow to a valid value */ 06529 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06530 return DRFLAC_FALSE; 06531 } 06532 metadata.data.picture.description = pRunningData; pRunningData += metadata.data.picture.descriptionLength; 06533 metadata.data.picture.width = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 06534 metadata.data.picture.height = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 06535 metadata.data.picture.colorDepth = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 06536 metadata.data.picture.indexColorCount = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 06537 metadata.data.picture.pictureDataSize = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 06538 metadata.data.picture.pPictureData = (const drflac_uint8*)pRunningData; 06539 06540 /* Need space for the picture after the block */ 06541 if (pRunningDataEnd - pRunningData < (drflac_int64)metadata.data.picture.pictureDataSize) { /* <-- Note the order of operations to avoid overflow to a valid value */ 06542 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06543 return DRFLAC_FALSE; 06544 } 06545 06546 onMeta(pUserDataMD, &metadata); 06547 06548 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06549 } 06550 } break; 06551 06552 case DRFLAC_METADATA_BLOCK_TYPE_PADDING: 06553 { 06554 if (onMeta) { 06555 metadata.data.padding.unused = 0; 06556 06557 /* Padding doesn't have anything meaningful in it, so just skip over it, but make sure the caller is aware of it by firing the callback. */ 06558 if (!onSeek(pUserData, blockSize, drflac_seek_origin_current)) { 06559 isLastBlock = DRFLAC_TRUE; /* An error occurred while seeking. Attempt to recover by treating this as the last block which will in turn terminate the loop. */ 06560 } else { 06561 onMeta(pUserDataMD, &metadata); 06562 } 06563 } 06564 } break; 06565 06566 case DRFLAC_METADATA_BLOCK_TYPE_INVALID: 06567 { 06568 /* Invalid chunk. Just skip over this one. */ 06569 if (onMeta) { 06570 if (!onSeek(pUserData, blockSize, drflac_seek_origin_current)) { 06571 isLastBlock = DRFLAC_TRUE; /* An error occurred while seeking. Attempt to recover by treating this as the last block which will in turn terminate the loop. */ 06572 } 06573 } 06574 } break; 06575 06576 default: 06577 { 06578 /* 06579 It's an unknown chunk, but not necessarily invalid. There's a chance more metadata blocks might be defined later on, so we 06580 can at the very least report the chunk to the application and let it look at the raw data. 06581 */ 06582 if (onMeta) { 06583 void* pRawData = drflac__malloc_from_callbacks(blockSize, pAllocationCallbacks); 06584 if (pRawData == NULL) { 06585 return DRFLAC_FALSE; 06586 } 06587 06588 if (onRead(pUserData, pRawData, blockSize) != blockSize) { 06589 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06590 return DRFLAC_FALSE; 06591 } 06592 06593 metadata.pRawData = pRawData; 06594 metadata.rawDataSize = blockSize; 06595 onMeta(pUserDataMD, &metadata); 06596 06597 drflac__free_from_callbacks(pRawData, pAllocationCallbacks); 06598 } 06599 } break; 06600 } 06601 06602 /* If we're not handling metadata, just skip over the block. If we are, it will have been handled earlier in the switch statement above. */ 06603 if (onMeta == NULL && blockSize > 0) { 06604 if (!onSeek(pUserData, blockSize, drflac_seek_origin_current)) { 06605 isLastBlock = DRFLAC_TRUE; 06606 } 06607 } 06608 06609 runningFilePos += blockSize; 06610 if (isLastBlock) { 06611 break; 06612 } 06613 } 06614 06615 *pSeektablePos = seektablePos; 06616 *pSeektableSize = seektableSize; 06617 *pFirstFramePos = runningFilePos; 06618 06619 return DRFLAC_TRUE; 06620 } 06621 06622 static drflac_bool32 drflac__init_private__native(drflac_init_info* pInit, drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, void* pUserDataMD, drflac_bool32 relaxed) 06623 { 06624 /* Pre Condition: The bit stream should be sitting just past the 4-byte id header. */ 06625 06626 drflac_uint8 isLastBlock; 06627 drflac_uint8 blockType; 06628 drflac_uint32 blockSize; 06629 06630 (void)onSeek; 06631 06632 pInit->container = drflac_container_native; 06633 06634 /* The first metadata block should be the STREAMINFO block. */ 06635 if (!drflac__read_and_decode_block_header(onRead, pUserData, &isLastBlock, &blockType, &blockSize)) { 06636 return DRFLAC_FALSE; 06637 } 06638 06639 if (blockType != DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO || blockSize != 34) { 06640 if (!relaxed) { 06641 /* We're opening in strict mode and the first block is not the STREAMINFO block. Error. */ 06642 return DRFLAC_FALSE; 06643 } else { 06644 /* 06645 Relaxed mode. To open from here we need to just find the first frame and set the sample rate, etc. to whatever is defined 06646 for that frame. 06647 */ 06648 pInit->hasStreamInfoBlock = DRFLAC_FALSE; 06649 pInit->hasMetadataBlocks = DRFLAC_FALSE; 06650 06651 if (!drflac__read_next_flac_frame_header(&pInit->bs, 0, &pInit->firstFrameHeader)) { 06652 return DRFLAC_FALSE; /* Couldn't find a frame. */ 06653 } 06654 06655 if (pInit->firstFrameHeader.bitsPerSample == 0) { 06656 return DRFLAC_FALSE; /* Failed to initialize because the first frame depends on the STREAMINFO block, which does not exist. */ 06657 } 06658 06659 pInit->sampleRate = pInit->firstFrameHeader.sampleRate; 06660 pInit->channels = drflac__get_channel_count_from_channel_assignment(pInit->firstFrameHeader.channelAssignment); 06661 pInit->bitsPerSample = pInit->firstFrameHeader.bitsPerSample; 06662 pInit->maxBlockSizeInPCMFrames = 65535; /* <-- See notes here: https://xiph.org/flac/format.html#metadata_block_streaminfo */ 06663 return DRFLAC_TRUE; 06664 } 06665 } else { 06666 drflac_streaminfo streaminfo; 06667 if (!drflac__read_streaminfo(onRead, pUserData, &streaminfo)) { 06668 return DRFLAC_FALSE; 06669 } 06670 06671 pInit->hasStreamInfoBlock = DRFLAC_TRUE; 06672 pInit->sampleRate = streaminfo.sampleRate; 06673 pInit->channels = streaminfo.channels; 06674 pInit->bitsPerSample = streaminfo.bitsPerSample; 06675 pInit->totalPCMFrameCount = streaminfo.totalPCMFrameCount; 06676 pInit->maxBlockSizeInPCMFrames = streaminfo.maxBlockSizeInPCMFrames; /* Don't care about the min block size - only the max (used for determining the size of the memory allocation). */ 06677 pInit->hasMetadataBlocks = !isLastBlock; 06678 06679 if (onMeta) { 06680 drflac_metadata metadata; 06681 metadata.type = DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO; 06682 metadata.pRawData = NULL; 06683 metadata.rawDataSize = 0; 06684 metadata.data.streaminfo = streaminfo; 06685 onMeta(pUserDataMD, &metadata); 06686 } 06687 06688 return DRFLAC_TRUE; 06689 } 06690 } 06691 06692 #ifndef DR_FLAC_NO_OGG 06693 #define DRFLAC_OGG_MAX_PAGE_SIZE 65307 06694 #define DRFLAC_OGG_CAPTURE_PATTERN_CRC32 1605413199 /* CRC-32 of "OggS". */ 06695 06696 typedef enum 06697 { 06698 drflac_ogg_recover_on_crc_mismatch, 06699 drflac_ogg_fail_on_crc_mismatch 06700 } drflac_ogg_crc_mismatch_recovery; 06701 06702 #ifndef DR_FLAC_NO_CRC 06703 static drflac_uint32 drflac__crc32_table[] = { 06704 0x00000000L, 0x04C11DB7L, 0x09823B6EL, 0x0D4326D9L, 06705 0x130476DCL, 0x17C56B6BL, 0x1A864DB2L, 0x1E475005L, 06706 0x2608EDB8L, 0x22C9F00FL, 0x2F8AD6D6L, 0x2B4BCB61L, 06707 0x350C9B64L, 0x31CD86D3L, 0x3C8EA00AL, 0x384FBDBDL, 06708 0x4C11DB70L, 0x48D0C6C7L, 0x4593E01EL, 0x4152FDA9L, 06709 0x5F15ADACL, 0x5BD4B01BL, 0x569796C2L, 0x52568B75L, 06710 0x6A1936C8L, 0x6ED82B7FL, 0x639B0DA6L, 0x675A1011L, 06711 0x791D4014L, 0x7DDC5DA3L, 0x709F7B7AL, 0x745E66CDL, 06712 0x9823B6E0L, 0x9CE2AB57L, 0x91A18D8EL, 0x95609039L, 06713 0x8B27C03CL, 0x8FE6DD8BL, 0x82A5FB52L, 0x8664E6E5L, 06714 0xBE2B5B58L, 0xBAEA46EFL, 0xB7A96036L, 0xB3687D81L, 06715 0xAD2F2D84L, 0xA9EE3033L, 0xA4AD16EAL, 0xA06C0B5DL, 06716 0xD4326D90L, 0xD0F37027L, 0xDDB056FEL, 0xD9714B49L, 06717 0xC7361B4CL, 0xC3F706FBL, 0xCEB42022L, 0xCA753D95L, 06718 0xF23A8028L, 0xF6FB9D9FL, 0xFBB8BB46L, 0xFF79A6F1L, 06719 0xE13EF6F4L, 0xE5FFEB43L, 0xE8BCCD9AL, 0xEC7DD02DL, 06720 0x34867077L, 0x30476DC0L, 0x3D044B19L, 0x39C556AEL, 06721 0x278206ABL, 0x23431B1CL, 0x2E003DC5L, 0x2AC12072L, 06722 0x128E9DCFL, 0x164F8078L, 0x1B0CA6A1L, 0x1FCDBB16L, 06723 0x018AEB13L, 0x054BF6A4L, 0x0808D07DL, 0x0CC9CDCAL, 06724 0x7897AB07L, 0x7C56B6B0L, 0x71159069L, 0x75D48DDEL, 06725 0x6B93DDDBL, 0x6F52C06CL, 0x6211E6B5L, 0x66D0FB02L, 06726 0x5E9F46BFL, 0x5A5E5B08L, 0x571D7DD1L, 0x53DC6066L, 06727 0x4D9B3063L, 0x495A2DD4L, 0x44190B0DL, 0x40D816BAL, 06728 0xACA5C697L, 0xA864DB20L, 0xA527FDF9L, 0xA1E6E04EL, 06729 0xBFA1B04BL, 0xBB60ADFCL, 0xB6238B25L, 0xB2E29692L, 06730 0x8AAD2B2FL, 0x8E6C3698L, 0x832F1041L, 0x87EE0DF6L, 06731 0x99A95DF3L, 0x9D684044L, 0x902B669DL, 0x94EA7B2AL, 06732 0xE0B41DE7L, 0xE4750050L, 0xE9362689L, 0xEDF73B3EL, 06733 0xF3B06B3BL, 0xF771768CL, 0xFA325055L, 0xFEF34DE2L, 06734 0xC6BCF05FL, 0xC27DEDE8L, 0xCF3ECB31L, 0xCBFFD686L, 06735 0xD5B88683L, 0xD1799B34L, 0xDC3ABDEDL, 0xD8FBA05AL, 06736 0x690CE0EEL, 0x6DCDFD59L, 0x608EDB80L, 0x644FC637L, 06737 0x7A089632L, 0x7EC98B85L, 0x738AAD5CL, 0x774BB0EBL, 06738 0x4F040D56L, 0x4BC510E1L, 0x46863638L, 0x42472B8FL, 06739 0x5C007B8AL, 0x58C1663DL, 0x558240E4L, 0x51435D53L, 06740 0x251D3B9EL, 0x21DC2629L, 0x2C9F00F0L, 0x285E1D47L, 06741 0x36194D42L, 0x32D850F5L, 0x3F9B762CL, 0x3B5A6B9BL, 06742 0x0315D626L, 0x07D4CB91L, 0x0A97ED48L, 0x0E56F0FFL, 06743 0x1011A0FAL, 0x14D0BD4DL, 0x19939B94L, 0x1D528623L, 06744 0xF12F560EL, 0xF5EE4BB9L, 0xF8AD6D60L, 0xFC6C70D7L, 06745 0xE22B20D2L, 0xE6EA3D65L, 0xEBA91BBCL, 0xEF68060BL, 06746 0xD727BBB6L, 0xD3E6A601L, 0xDEA580D8L, 0xDA649D6FL, 06747 0xC423CD6AL, 0xC0E2D0DDL, 0xCDA1F604L, 0xC960EBB3L, 06748 0xBD3E8D7EL, 0xB9FF90C9L, 0xB4BCB610L, 0xB07DABA7L, 06749 0xAE3AFBA2L, 0xAAFBE615L, 0xA7B8C0CCL, 0xA379DD7BL, 06750 0x9B3660C6L, 0x9FF77D71L, 0x92B45BA8L, 0x9675461FL, 06751 0x8832161AL, 0x8CF30BADL, 0x81B02D74L, 0x857130C3L, 06752 0x5D8A9099L, 0x594B8D2EL, 0x5408ABF7L, 0x50C9B640L, 06753 0x4E8EE645L, 0x4A4FFBF2L, 0x470CDD2BL, 0x43CDC09CL, 06754 0x7B827D21L, 0x7F436096L, 0x7200464FL, 0x76C15BF8L, 06755 0x68860BFDL, 0x6C47164AL, 0x61043093L, 0x65C52D24L, 06756 0x119B4BE9L, 0x155A565EL, 0x18197087L, 0x1CD86D30L, 06757 0x029F3D35L, 0x065E2082L, 0x0B1D065BL, 0x0FDC1BECL, 06758 0x3793A651L, 0x3352BBE6L, 0x3E119D3FL, 0x3AD08088L, 06759 0x2497D08DL, 0x2056CD3AL, 0x2D15EBE3L, 0x29D4F654L, 06760 0xC5A92679L, 0xC1683BCEL, 0xCC2B1D17L, 0xC8EA00A0L, 06761 0xD6AD50A5L, 0xD26C4D12L, 0xDF2F6BCBL, 0xDBEE767CL, 06762 0xE3A1CBC1L, 0xE760D676L, 0xEA23F0AFL, 0xEEE2ED18L, 06763 0xF0A5BD1DL, 0xF464A0AAL, 0xF9278673L, 0xFDE69BC4L, 06764 0x89B8FD09L, 0x8D79E0BEL, 0x803AC667L, 0x84FBDBD0L, 06765 0x9ABC8BD5L, 0x9E7D9662L, 0x933EB0BBL, 0x97FFAD0CL, 06766 0xAFB010B1L, 0xAB710D06L, 0xA6322BDFL, 0xA2F33668L, 06767 0xBCB4666DL, 0xB8757BDAL, 0xB5365D03L, 0xB1F740B4L 06768 }; 06769 #endif 06770 06771 static DRFLAC_INLINE drflac_uint32 drflac_crc32_byte(drflac_uint32 crc32, drflac_uint8 data) 06772 { 06773 #ifndef DR_FLAC_NO_CRC 06774 return (crc32 << 8) ^ drflac__crc32_table[(drflac_uint8)((crc32 >> 24) & 0xFF) ^ data]; 06775 #else 06776 (void)data; 06777 return crc32; 06778 #endif 06779 } 06780 06781 #if 0 06782 static DRFLAC_INLINE drflac_uint32 drflac_crc32_uint32(drflac_uint32 crc32, drflac_uint32 data) 06783 { 06784 crc32 = drflac_crc32_byte(crc32, (drflac_uint8)((data >> 24) & 0xFF)); 06785 crc32 = drflac_crc32_byte(crc32, (drflac_uint8)((data >> 16) & 0xFF)); 06786 crc32 = drflac_crc32_byte(crc32, (drflac_uint8)((data >> 8) & 0xFF)); 06787 crc32 = drflac_crc32_byte(crc32, (drflac_uint8)((data >> 0) & 0xFF)); 06788 return crc32; 06789 } 06790 06791 static DRFLAC_INLINE drflac_uint32 drflac_crc32_uint64(drflac_uint32 crc32, drflac_uint64 data) 06792 { 06793 crc32 = drflac_crc32_uint32(crc32, (drflac_uint32)((data >> 32) & 0xFFFFFFFF)); 06794 crc32 = drflac_crc32_uint32(crc32, (drflac_uint32)((data >> 0) & 0xFFFFFFFF)); 06795 return crc32; 06796 } 06797 #endif 06798 06799 static DRFLAC_INLINE drflac_uint32 drflac_crc32_buffer(drflac_uint32 crc32, drflac_uint8* pData, drflac_uint32 dataSize) 06800 { 06801 /* This can be optimized. */ 06802 drflac_uint32 i; 06803 for (i = 0; i < dataSize; ++i) { 06804 crc32 = drflac_crc32_byte(crc32, pData[i]); 06805 } 06806 return crc32; 06807 } 06808 06809 06810 static DRFLAC_INLINE drflac_bool32 drflac_ogg__is_capture_pattern(drflac_uint8 pattern[4]) 06811 { 06812 return pattern[0] == 'O' && pattern[1] == 'g' && pattern[2] == 'g' && pattern[3] == 'S'; 06813 } 06814 06815 static DRFLAC_INLINE drflac_uint32 drflac_ogg__get_page_header_size(drflac_ogg_page_header* pHeader) 06816 { 06817 return 27 + pHeader->segmentCount; 06818 } 06819 06820 static DRFLAC_INLINE drflac_uint32 drflac_ogg__get_page_body_size(drflac_ogg_page_header* pHeader) 06821 { 06822 drflac_uint32 pageBodySize = 0; 06823 int i; 06824 06825 for (i = 0; i < pHeader->segmentCount; ++i) { 06826 pageBodySize += pHeader->segmentTable[i]; 06827 } 06828 06829 return pageBodySize; 06830 } 06831 06832 static drflac_result drflac_ogg__read_page_header_after_capture_pattern(drflac_read_proc onRead, void* pUserData, drflac_ogg_page_header* pHeader, drflac_uint32* pBytesRead, drflac_uint32* pCRC32) 06833 { 06834 drflac_uint8 data[23]; 06835 drflac_uint32 i; 06836 06837 DRFLAC_ASSERT(*pCRC32 == DRFLAC_OGG_CAPTURE_PATTERN_CRC32); 06838 06839 if (onRead(pUserData, data, 23) != 23) { 06840 return DRFLAC_AT_END; 06841 } 06842 *pBytesRead += 23; 06843 06844 /* 06845 It's not actually used, but set the capture pattern to 'OggS' for completeness. Not doing this will cause static analysers to complain about 06846 us trying to access uninitialized data. We could alternatively just comment out this member of the drflac_ogg_page_header structure, but I 06847 like to have it map to the structure of the underlying data. 06848 */ 06849 pHeader->capturePattern[0] = 'O'; 06850 pHeader->capturePattern[1] = 'g'; 06851 pHeader->capturePattern[2] = 'g'; 06852 pHeader->capturePattern[3] = 'S'; 06853 06854 pHeader->structureVersion = data[0]; 06855 pHeader->headerType = data[1]; 06856 DRFLAC_COPY_MEMORY(&pHeader->granulePosition, &data[ 2], 8); 06857 DRFLAC_COPY_MEMORY(&pHeader->serialNumber, &data[10], 4); 06858 DRFLAC_COPY_MEMORY(&pHeader->sequenceNumber, &data[14], 4); 06859 DRFLAC_COPY_MEMORY(&pHeader->checksum, &data[18], 4); 06860 pHeader->segmentCount = data[22]; 06861 06862 /* Calculate the CRC. Note that for the calculation the checksum part of the page needs to be set to 0. */ 06863 data[18] = 0; 06864 data[19] = 0; 06865 data[20] = 0; 06866 data[21] = 0; 06867 06868 for (i = 0; i < 23; ++i) { 06869 *pCRC32 = drflac_crc32_byte(*pCRC32, data[i]); 06870 } 06871 06872 06873 if (onRead(pUserData, pHeader->segmentTable, pHeader->segmentCount) != pHeader->segmentCount) { 06874 return DRFLAC_AT_END; 06875 } 06876 *pBytesRead += pHeader->segmentCount; 06877 06878 for (i = 0; i < pHeader->segmentCount; ++i) { 06879 *pCRC32 = drflac_crc32_byte(*pCRC32, pHeader->segmentTable[i]); 06880 } 06881 06882 return DRFLAC_SUCCESS; 06883 } 06884 06885 static drflac_result drflac_ogg__read_page_header(drflac_read_proc onRead, void* pUserData, drflac_ogg_page_header* pHeader, drflac_uint32* pBytesRead, drflac_uint32* pCRC32) 06886 { 06887 drflac_uint8 id[4]; 06888 06889 *pBytesRead = 0; 06890 06891 if (onRead(pUserData, id, 4) != 4) { 06892 return DRFLAC_AT_END; 06893 } 06894 *pBytesRead += 4; 06895 06896 /* We need to read byte-by-byte until we find the OggS capture pattern. */ 06897 for (;;) { 06898 if (drflac_ogg__is_capture_pattern(id)) { 06899 drflac_result result; 06900 06901 *pCRC32 = DRFLAC_OGG_CAPTURE_PATTERN_CRC32; 06902 06903 result = drflac_ogg__read_page_header_after_capture_pattern(onRead, pUserData, pHeader, pBytesRead, pCRC32); 06904 if (result == DRFLAC_SUCCESS) { 06905 return DRFLAC_SUCCESS; 06906 } else { 06907 if (result == DRFLAC_CRC_MISMATCH) { 06908 continue; 06909 } else { 06910 return result; 06911 } 06912 } 06913 } else { 06914 /* The first 4 bytes did not equal the capture pattern. Read the next byte and try again. */ 06915 id[0] = id[1]; 06916 id[1] = id[2]; 06917 id[2] = id[3]; 06918 if (onRead(pUserData, &id[3], 1) != 1) { 06919 return DRFLAC_AT_END; 06920 } 06921 *pBytesRead += 1; 06922 } 06923 } 06924 } 06925 06926 06927 /* 06928 The main part of the Ogg encapsulation is the conversion from the physical Ogg bitstream to the native FLAC bitstream. It works 06929 in three general stages: Ogg Physical Bitstream -> Ogg/FLAC Logical Bitstream -> FLAC Native Bitstream. dr_flac is designed 06930 in such a way that the core sections assume everything is delivered in native format. Therefore, for each encapsulation type 06931 dr_flac is supporting there needs to be a layer sitting on top of the onRead and onSeek callbacks that ensures the bits read from 06932 the physical Ogg bitstream are converted and delivered in native FLAC format. 06933 */ 06934 typedef struct 06935 { 06936 drflac_read_proc onRead; /* The original onRead callback from drflac_open() and family. */ 06937 drflac_seek_proc onSeek; /* The original onSeek callback from drflac_open() and family. */ 06938 void* pUserData; /* The user data passed on onRead and onSeek. This is the user data that was passed on drflac_open() and family. */ 06939 drflac_uint64 currentBytePos; /* The position of the byte we are sitting on in the physical byte stream. Used for efficient seeking. */ 06940 drflac_uint64 firstBytePos; /* The position of the first byte in the physical bitstream. Points to the start of the "OggS" identifier of the FLAC bos page. */ 06941 drflac_uint32 serialNumber; /* The serial number of the FLAC audio pages. This is determined by the initial header page that was read during initialization. */ 06942 drflac_ogg_page_header bosPageHeader; /* Used for seeking. */ 06943 drflac_ogg_page_header currentPageHeader; 06944 drflac_uint32 bytesRemainingInPage; 06945 drflac_uint32 pageDataSize; 06946 drflac_uint8 pageData[DRFLAC_OGG_MAX_PAGE_SIZE]; 06947 } drflac_oggbs; /* oggbs = Ogg Bitstream */ 06948 06949 static size_t drflac_oggbs__read_physical(drflac_oggbs* oggbs, void* bufferOut, size_t bytesToRead) 06950 { 06951 size_t bytesActuallyRead = oggbs->onRead(oggbs->pUserData, bufferOut, bytesToRead); 06952 oggbs->currentBytePos += bytesActuallyRead; 06953 06954 return bytesActuallyRead; 06955 } 06956 06957 static drflac_bool32 drflac_oggbs__seek_physical(drflac_oggbs* oggbs, drflac_uint64 offset, drflac_seek_origin origin) 06958 { 06959 if (origin == drflac_seek_origin_start) { 06960 if (offset <= 0x7FFFFFFF) { 06961 if (!oggbs->onSeek(oggbs->pUserData, (int)offset, drflac_seek_origin_start)) { 06962 return DRFLAC_FALSE; 06963 } 06964 oggbs->currentBytePos = offset; 06965 06966 return DRFLAC_TRUE; 06967 } else { 06968 if (!oggbs->onSeek(oggbs->pUserData, 0x7FFFFFFF, drflac_seek_origin_start)) { 06969 return DRFLAC_FALSE; 06970 } 06971 oggbs->currentBytePos = offset; 06972 06973 return drflac_oggbs__seek_physical(oggbs, offset - 0x7FFFFFFF, drflac_seek_origin_current); 06974 } 06975 } else { 06976 while (offset > 0x7FFFFFFF) { 06977 if (!oggbs->onSeek(oggbs->pUserData, 0x7FFFFFFF, drflac_seek_origin_current)) { 06978 return DRFLAC_FALSE; 06979 } 06980 oggbs->currentBytePos += 0x7FFFFFFF; 06981 offset -= 0x7FFFFFFF; 06982 } 06983 06984 if (!oggbs->onSeek(oggbs->pUserData, (int)offset, drflac_seek_origin_current)) { /* <-- Safe cast thanks to the loop above. */ 06985 return DRFLAC_FALSE; 06986 } 06987 oggbs->currentBytePos += offset; 06988 06989 return DRFLAC_TRUE; 06990 } 06991 } 06992 06993 static drflac_bool32 drflac_oggbs__goto_next_page(drflac_oggbs* oggbs, drflac_ogg_crc_mismatch_recovery recoveryMethod) 06994 { 06995 drflac_ogg_page_header header; 06996 for (;;) { 06997 drflac_uint32 crc32 = 0; 06998 drflac_uint32 bytesRead; 06999 drflac_uint32 pageBodySize; 07000 #ifndef DR_FLAC_NO_CRC 07001 drflac_uint32 actualCRC32; 07002 #endif 07003 07004 if (drflac_ogg__read_page_header(oggbs->onRead, oggbs->pUserData, &header, &bytesRead, &crc32) != DRFLAC_SUCCESS) { 07005 return DRFLAC_FALSE; 07006 } 07007 oggbs->currentBytePos += bytesRead; 07008 07009 pageBodySize = drflac_ogg__get_page_body_size(&header); 07010 if (pageBodySize > DRFLAC_OGG_MAX_PAGE_SIZE) { 07011 continue; /* Invalid page size. Assume it's corrupted and just move to the next page. */ 07012 } 07013 07014 if (header.serialNumber != oggbs->serialNumber) { 07015 /* It's not a FLAC page. Skip it. */ 07016 if (pageBodySize > 0 && !drflac_oggbs__seek_physical(oggbs, pageBodySize, drflac_seek_origin_current)) { 07017 return DRFLAC_FALSE; 07018 } 07019 continue; 07020 } 07021 07022 07023 /* We need to read the entire page and then do a CRC check on it. If there's a CRC mismatch we need to skip this page. */ 07024 if (drflac_oggbs__read_physical(oggbs, oggbs->pageData, pageBodySize) != pageBodySize) { 07025 return DRFLAC_FALSE; 07026 } 07027 oggbs->pageDataSize = pageBodySize; 07028 07029 #ifndef DR_FLAC_NO_CRC 07030 actualCRC32 = drflac_crc32_buffer(crc32, oggbs->pageData, oggbs->pageDataSize); 07031 if (actualCRC32 != header.checksum) { 07032 if (recoveryMethod == drflac_ogg_recover_on_crc_mismatch) { 07033 continue; /* CRC mismatch. Skip this page. */ 07034 } else { 07035 /* 07036 Even though we are failing on a CRC mismatch, we still want our stream to be in a good state. Therefore we 07037 go to the next valid page to ensure we're in a good state, but return false to let the caller know that the 07038 seek did not fully complete. 07039 */ 07040 drflac_oggbs__goto_next_page(oggbs, drflac_ogg_recover_on_crc_mismatch); 07041 return DRFLAC_FALSE; 07042 } 07043 } 07044 #else 07045 (void)recoveryMethod; /* <-- Silence a warning. */ 07046 #endif 07047 07048 oggbs->currentPageHeader = header; 07049 oggbs->bytesRemainingInPage = pageBodySize; 07050 return DRFLAC_TRUE; 07051 } 07052 } 07053 07054 /* Function below is unused at the moment, but I might be re-adding it later. */ 07055 #if 0 07056 static drflac_uint8 drflac_oggbs__get_current_segment_index(drflac_oggbs* oggbs, drflac_uint8* pBytesRemainingInSeg) 07057 { 07058 drflac_uint32 bytesConsumedInPage = drflac_ogg__get_page_body_size(&oggbs->currentPageHeader) - oggbs->bytesRemainingInPage; 07059 drflac_uint8 iSeg = 0; 07060 drflac_uint32 iByte = 0; 07061 while (iByte < bytesConsumedInPage) { 07062 drflac_uint8 segmentSize = oggbs->currentPageHeader.segmentTable[iSeg]; 07063 if (iByte + segmentSize > bytesConsumedInPage) { 07064 break; 07065 } else { 07066 iSeg += 1; 07067 iByte += segmentSize; 07068 } 07069 } 07070 07071 *pBytesRemainingInSeg = oggbs->currentPageHeader.segmentTable[iSeg] - (drflac_uint8)(bytesConsumedInPage - iByte); 07072 return iSeg; 07073 } 07074 07075 static drflac_bool32 drflac_oggbs__seek_to_next_packet(drflac_oggbs* oggbs) 07076 { 07077 /* The current packet ends when we get to the segment with a lacing value of < 255 which is not at the end of a page. */ 07078 for (;;) { 07079 drflac_bool32 atEndOfPage = DRFLAC_FALSE; 07080 07081 drflac_uint8 bytesRemainingInSeg; 07082 drflac_uint8 iFirstSeg = drflac_oggbs__get_current_segment_index(oggbs, &bytesRemainingInSeg); 07083 07084 drflac_uint32 bytesToEndOfPacketOrPage = bytesRemainingInSeg; 07085 for (drflac_uint8 iSeg = iFirstSeg; iSeg < oggbs->currentPageHeader.segmentCount; ++iSeg) { 07086 drflac_uint8 segmentSize = oggbs->currentPageHeader.segmentTable[iSeg]; 07087 if (segmentSize < 255) { 07088 if (iSeg == oggbs->currentPageHeader.segmentCount-1) { 07089 atEndOfPage = DRFLAC_TRUE; 07090 } 07091 07092 break; 07093 } 07094 07095 bytesToEndOfPacketOrPage += segmentSize; 07096 } 07097 07098 /* 07099 At this point we will have found either the packet or the end of the page. If were at the end of the page we'll 07100 want to load the next page and keep searching for the end of the packet. 07101 */ 07102 drflac_oggbs__seek_physical(oggbs, bytesToEndOfPacketOrPage, drflac_seek_origin_current); 07103 oggbs->bytesRemainingInPage -= bytesToEndOfPacketOrPage; 07104 07105 if (atEndOfPage) { 07106 /* 07107 We're potentially at the next packet, but we need to check the next page first to be sure because the packet may 07108 straddle pages. 07109 */ 07110 if (!drflac_oggbs__goto_next_page(oggbs)) { 07111 return DRFLAC_FALSE; 07112 } 07113 07114 /* If it's a fresh packet it most likely means we're at the next packet. */ 07115 if ((oggbs->currentPageHeader.headerType & 0x01) == 0) { 07116 return DRFLAC_TRUE; 07117 } 07118 } else { 07119 /* We're at the next packet. */ 07120 return DRFLAC_TRUE; 07121 } 07122 } 07123 } 07124 07125 static drflac_bool32 drflac_oggbs__seek_to_next_frame(drflac_oggbs* oggbs) 07126 { 07127 /* The bitstream should be sitting on the first byte just after the header of the frame. */ 07128 07129 /* What we're actually doing here is seeking to the start of the next packet. */ 07130 return drflac_oggbs__seek_to_next_packet(oggbs); 07131 } 07132 #endif 07133 07134 static size_t drflac__on_read_ogg(void* pUserData, void* bufferOut, size_t bytesToRead) 07135 { 07136 drflac_oggbs* oggbs = (drflac_oggbs*)pUserData; 07137 drflac_uint8* pRunningBufferOut = (drflac_uint8*)bufferOut; 07138 size_t bytesRead = 0; 07139 07140 DRFLAC_ASSERT(oggbs != NULL); 07141 DRFLAC_ASSERT(pRunningBufferOut != NULL); 07142 07143 /* Reading is done page-by-page. If we've run out of bytes in the page we need to move to the next one. */ 07144 while (bytesRead < bytesToRead) { 07145 size_t bytesRemainingToRead = bytesToRead - bytesRead; 07146 07147 if (oggbs->bytesRemainingInPage >= bytesRemainingToRead) { 07148 DRFLAC_COPY_MEMORY(pRunningBufferOut, oggbs->pageData + (oggbs->pageDataSize - oggbs->bytesRemainingInPage), bytesRemainingToRead); 07149 bytesRead += bytesRemainingToRead; 07150 oggbs->bytesRemainingInPage -= (drflac_uint32)bytesRemainingToRead; 07151 break; 07152 } 07153 07154 /* If we get here it means some of the requested data is contained in the next pages. */ 07155 if (oggbs->bytesRemainingInPage > 0) { 07156 DRFLAC_COPY_MEMORY(pRunningBufferOut, oggbs->pageData + (oggbs->pageDataSize - oggbs->bytesRemainingInPage), oggbs->bytesRemainingInPage); 07157 bytesRead += oggbs->bytesRemainingInPage; 07158 pRunningBufferOut += oggbs->bytesRemainingInPage; 07159 oggbs->bytesRemainingInPage = 0; 07160 } 07161 07162 DRFLAC_ASSERT(bytesRemainingToRead > 0); 07163 if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_recover_on_crc_mismatch)) { 07164 break; /* Failed to go to the next page. Might have simply hit the end of the stream. */ 07165 } 07166 } 07167 07168 return bytesRead; 07169 } 07170 07171 static drflac_bool32 drflac__on_seek_ogg(void* pUserData, int offset, drflac_seek_origin origin) 07172 { 07173 drflac_oggbs* oggbs = (drflac_oggbs*)pUserData; 07174 int bytesSeeked = 0; 07175 07176 DRFLAC_ASSERT(oggbs != NULL); 07177 DRFLAC_ASSERT(offset >= 0); /* <-- Never seek backwards. */ 07178 07179 /* Seeking is always forward which makes things a lot simpler. */ 07180 if (origin == drflac_seek_origin_start) { 07181 if (!drflac_oggbs__seek_physical(oggbs, (int)oggbs->firstBytePos, drflac_seek_origin_start)) { 07182 return DRFLAC_FALSE; 07183 } 07184 07185 if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_fail_on_crc_mismatch)) { 07186 return DRFLAC_FALSE; 07187 } 07188 07189 return drflac__on_seek_ogg(pUserData, offset, drflac_seek_origin_current); 07190 } 07191 07192 DRFLAC_ASSERT(origin == drflac_seek_origin_current); 07193 07194 while (bytesSeeked < offset) { 07195 int bytesRemainingToSeek = offset - bytesSeeked; 07196 DRFLAC_ASSERT(bytesRemainingToSeek >= 0); 07197 07198 if (oggbs->bytesRemainingInPage >= (size_t)bytesRemainingToSeek) { 07199 bytesSeeked += bytesRemainingToSeek; 07200 (void)bytesSeeked; /* <-- Silence a dead store warning emitted by Clang Static Analyzer. */ 07201 oggbs->bytesRemainingInPage -= bytesRemainingToSeek; 07202 break; 07203 } 07204 07205 /* If we get here it means some of the requested data is contained in the next pages. */ 07206 if (oggbs->bytesRemainingInPage > 0) { 07207 bytesSeeked += (int)oggbs->bytesRemainingInPage; 07208 oggbs->bytesRemainingInPage = 0; 07209 } 07210 07211 DRFLAC_ASSERT(bytesRemainingToSeek > 0); 07212 if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_fail_on_crc_mismatch)) { 07213 /* Failed to go to the next page. We either hit the end of the stream or had a CRC mismatch. */ 07214 return DRFLAC_FALSE; 07215 } 07216 } 07217 07218 return DRFLAC_TRUE; 07219 } 07220 07221 07222 static drflac_bool32 drflac_ogg__seek_to_pcm_frame(drflac* pFlac, drflac_uint64 pcmFrameIndex) 07223 { 07224 drflac_oggbs* oggbs = (drflac_oggbs*)pFlac->_oggbs; 07225 drflac_uint64 originalBytePos; 07226 drflac_uint64 runningGranulePosition; 07227 drflac_uint64 runningFrameBytePos; 07228 drflac_uint64 runningPCMFrameCount; 07229 07230 DRFLAC_ASSERT(oggbs != NULL); 07231 07232 originalBytePos = oggbs->currentBytePos; /* For recovery. Points to the OggS identifier. */ 07233 07234 /* First seek to the first frame. */ 07235 if (!drflac__seek_to_byte(&pFlac->bs, pFlac->firstFLACFramePosInBytes)) { 07236 return DRFLAC_FALSE; 07237 } 07238 oggbs->bytesRemainingInPage = 0; 07239 07240 runningGranulePosition = 0; 07241 for (;;) { 07242 if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_recover_on_crc_mismatch)) { 07243 drflac_oggbs__seek_physical(oggbs, originalBytePos, drflac_seek_origin_start); 07244 return DRFLAC_FALSE; /* Never did find that sample... */ 07245 } 07246 07247 runningFrameBytePos = oggbs->currentBytePos - drflac_ogg__get_page_header_size(&oggbs->currentPageHeader) - oggbs->pageDataSize; 07248 if (oggbs->currentPageHeader.granulePosition >= pcmFrameIndex) { 07249 break; /* The sample is somewhere in the previous page. */ 07250 } 07251 07252 /* 07253 At this point we know the sample is not in the previous page. It could possibly be in this page. For simplicity we 07254 disregard any pages that do not begin a fresh packet. 07255 */ 07256 if ((oggbs->currentPageHeader.headerType & 0x01) == 0) { /* <-- Is it a fresh page? */ 07257 if (oggbs->currentPageHeader.segmentTable[0] >= 2) { 07258 drflac_uint8 firstBytesInPage[2]; 07259 firstBytesInPage[0] = oggbs->pageData[0]; 07260 firstBytesInPage[1] = oggbs->pageData[1]; 07261 07262 if ((firstBytesInPage[0] == 0xFF) && (firstBytesInPage[1] & 0xFC) == 0xF8) { /* <-- Does the page begin with a frame's sync code? */ 07263 runningGranulePosition = oggbs->currentPageHeader.granulePosition; 07264 } 07265 07266 continue; 07267 } 07268 } 07269 } 07270 07271 /* 07272 We found the page that that is closest to the sample, so now we need to find it. The first thing to do is seek to the 07273 start of that page. In the loop above we checked that it was a fresh page which means this page is also the start of 07274 a new frame. This property means that after we've seeked to the page we can immediately start looping over frames until 07275 we find the one containing the target sample. 07276 */ 07277 if (!drflac_oggbs__seek_physical(oggbs, runningFrameBytePos, drflac_seek_origin_start)) { 07278 return DRFLAC_FALSE; 07279 } 07280 if (!drflac_oggbs__goto_next_page(oggbs, drflac_ogg_recover_on_crc_mismatch)) { 07281 return DRFLAC_FALSE; 07282 } 07283 07284 /* 07285 At this point we'll be sitting on the first byte of the frame header of the first frame in the page. We just keep 07286 looping over these frames until we find the one containing the sample we're after. 07287 */ 07288 runningPCMFrameCount = runningGranulePosition; 07289 for (;;) { 07290 /* 07291 There are two ways to find the sample and seek past irrelevant frames: 07292 1) Use the native FLAC decoder. 07293 2) Use Ogg's framing system. 07294 07295 Both of these options have their own pros and cons. Using the native FLAC decoder is slower because it needs to 07296 do a full decode of the frame. Using Ogg's framing system is faster, but more complicated and involves some code 07297 duplication for the decoding of frame headers. 07298 07299 Another thing to consider is that using the Ogg framing system will perform direct seeking of the physical Ogg 07300 bitstream. This is important to consider because it means we cannot read data from the drflac_bs object using the 07301 standard drflac__*() APIs because that will read in extra data for its own internal caching which in turn breaks 07302 the positioning of the read pointer of the physical Ogg bitstream. Therefore, anything that would normally be read 07303 using the native FLAC decoding APIs, such as drflac__read_next_flac_frame_header(), need to be re-implemented so as to 07304 avoid the use of the drflac_bs object. 07305 07306 Considering these issues, I have decided to use the slower native FLAC decoding method for the following reasons: 07307 1) Seeking is already partially accelerated using Ogg's paging system in the code block above. 07308 2) Seeking in an Ogg encapsulated FLAC stream is probably quite uncommon. 07309 3) Simplicity. 07310 */ 07311 drflac_uint64 firstPCMFrameInFLACFrame = 0; 07312 drflac_uint64 lastPCMFrameInFLACFrame = 0; 07313 drflac_uint64 pcmFrameCountInThisFrame; 07314 07315 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 07316 return DRFLAC_FALSE; 07317 } 07318 07319 drflac__get_pcm_frame_range_of_current_flac_frame(pFlac, &firstPCMFrameInFLACFrame, &lastPCMFrameInFLACFrame); 07320 07321 pcmFrameCountInThisFrame = (lastPCMFrameInFLACFrame - firstPCMFrameInFLACFrame) + 1; 07322 07323 /* If we are seeking to the end of the file and we've just hit it, we're done. */ 07324 if (pcmFrameIndex == pFlac->totalPCMFrameCount && (runningPCMFrameCount + pcmFrameCountInThisFrame) == pFlac->totalPCMFrameCount) { 07325 drflac_result result = drflac__decode_flac_frame(pFlac); 07326 if (result == DRFLAC_SUCCESS) { 07327 pFlac->currentPCMFrame = pcmFrameIndex; 07328 pFlac->currentFLACFrame.pcmFramesRemaining = 0; 07329 return DRFLAC_TRUE; 07330 } else { 07331 return DRFLAC_FALSE; 07332 } 07333 } 07334 07335 if (pcmFrameIndex < (runningPCMFrameCount + pcmFrameCountInThisFrame)) { 07336 /* 07337 The sample should be in this FLAC frame. We need to fully decode it, however if it's an invalid frame (a CRC mismatch), we need to pretend 07338 it never existed and keep iterating. 07339 */ 07340 drflac_result result = drflac__decode_flac_frame(pFlac); 07341 if (result == DRFLAC_SUCCESS) { 07342 /* The frame is valid. We just need to skip over some samples to ensure it's sample-exact. */ 07343 drflac_uint64 pcmFramesToDecode = (size_t)(pcmFrameIndex - runningPCMFrameCount); /* <-- Safe cast because the maximum number of samples in a frame is 65535. */ 07344 if (pcmFramesToDecode == 0) { 07345 return DRFLAC_TRUE; 07346 } 07347 07348 pFlac->currentPCMFrame = runningPCMFrameCount; 07349 07350 return drflac__seek_forward_by_pcm_frames(pFlac, pcmFramesToDecode) == pcmFramesToDecode; /* <-- If this fails, something bad has happened (it should never fail). */ 07351 } else { 07352 if (result == DRFLAC_CRC_MISMATCH) { 07353 continue; /* CRC mismatch. Pretend this frame never existed. */ 07354 } else { 07355 return DRFLAC_FALSE; 07356 } 07357 } 07358 } else { 07359 /* 07360 It's not in this frame. We need to seek past the frame, but check if there was a CRC mismatch. If so, we pretend this 07361 frame never existed and leave the running sample count untouched. 07362 */ 07363 drflac_result result = drflac__seek_to_next_flac_frame(pFlac); 07364 if (result == DRFLAC_SUCCESS) { 07365 runningPCMFrameCount += pcmFrameCountInThisFrame; 07366 } else { 07367 if (result == DRFLAC_CRC_MISMATCH) { 07368 continue; /* CRC mismatch. Pretend this frame never existed. */ 07369 } else { 07370 return DRFLAC_FALSE; 07371 } 07372 } 07373 } 07374 } 07375 } 07376 07377 07378 07379 static drflac_bool32 drflac__init_private__ogg(drflac_init_info* pInit, drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, void* pUserDataMD, drflac_bool32 relaxed) 07380 { 07381 drflac_ogg_page_header header; 07382 drflac_uint32 crc32 = DRFLAC_OGG_CAPTURE_PATTERN_CRC32; 07383 drflac_uint32 bytesRead = 0; 07384 07385 /* Pre Condition: The bit stream should be sitting just past the 4-byte OggS capture pattern. */ 07386 (void)relaxed; 07387 07388 pInit->container = drflac_container_ogg; 07389 pInit->oggFirstBytePos = 0; 07390 07391 /* 07392 We'll get here if the first 4 bytes of the stream were the OggS capture pattern, however it doesn't necessarily mean the 07393 stream includes FLAC encoded audio. To check for this we need to scan the beginning-of-stream page markers and check if 07394 any match the FLAC specification. Important to keep in mind that the stream may be multiplexed. 07395 */ 07396 if (drflac_ogg__read_page_header_after_capture_pattern(onRead, pUserData, &header, &bytesRead, &crc32) != DRFLAC_SUCCESS) { 07397 return DRFLAC_FALSE; 07398 } 07399 pInit->runningFilePos += bytesRead; 07400 07401 for (;;) { 07402 int pageBodySize; 07403 07404 /* Break if we're past the beginning of stream page. */ 07405 if ((header.headerType & 0x02) == 0) { 07406 return DRFLAC_FALSE; 07407 } 07408 07409 /* Check if it's a FLAC header. */ 07410 pageBodySize = drflac_ogg__get_page_body_size(&header); 07411 if (pageBodySize == 51) { /* 51 = the lacing value of the FLAC header packet. */ 07412 /* It could be a FLAC page... */ 07413 drflac_uint32 bytesRemainingInPage = pageBodySize; 07414 drflac_uint8 packetType; 07415 07416 if (onRead(pUserData, &packetType, 1) != 1) { 07417 return DRFLAC_FALSE; 07418 } 07419 07420 bytesRemainingInPage -= 1; 07421 if (packetType == 0x7F) { 07422 /* Increasingly more likely to be a FLAC page... */ 07423 drflac_uint8 sig[4]; 07424 if (onRead(pUserData, sig, 4) != 4) { 07425 return DRFLAC_FALSE; 07426 } 07427 07428 bytesRemainingInPage -= 4; 07429 if (sig[0] == 'F' && sig[1] == 'L' && sig[2] == 'A' && sig[3] == 'C') { 07430 /* Almost certainly a FLAC page... */ 07431 drflac_uint8 mappingVersion[2]; 07432 if (onRead(pUserData, mappingVersion, 2) != 2) { 07433 return DRFLAC_FALSE; 07434 } 07435 07436 if (mappingVersion[0] != 1) { 07437 return DRFLAC_FALSE; /* Only supporting version 1.x of the Ogg mapping. */ 07438 } 07439 07440 /* 07441 The next 2 bytes are the non-audio packets, not including this one. We don't care about this because we're going to 07442 be handling it in a generic way based on the serial number and packet types. 07443 */ 07444 if (!onSeek(pUserData, 2, drflac_seek_origin_current)) { 07445 return DRFLAC_FALSE; 07446 } 07447 07448 /* Expecting the native FLAC signature "fLaC". */ 07449 if (onRead(pUserData, sig, 4) != 4) { 07450 return DRFLAC_FALSE; 07451 } 07452 07453 if (sig[0] == 'f' && sig[1] == 'L' && sig[2] == 'a' && sig[3] == 'C') { 07454 /* The remaining data in the page should be the STREAMINFO block. */ 07455 drflac_streaminfo streaminfo; 07456 drflac_uint8 isLastBlock; 07457 drflac_uint8 blockType; 07458 drflac_uint32 blockSize; 07459 if (!drflac__read_and_decode_block_header(onRead, pUserData, &isLastBlock, &blockType, &blockSize)) { 07460 return DRFLAC_FALSE; 07461 } 07462 07463 if (blockType != DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO || blockSize != 34) { 07464 return DRFLAC_FALSE; /* Invalid block type. First block must be the STREAMINFO block. */ 07465 } 07466 07467 if (drflac__read_streaminfo(onRead, pUserData, &streaminfo)) { 07468 /* Success! */ 07469 pInit->hasStreamInfoBlock = DRFLAC_TRUE; 07470 pInit->sampleRate = streaminfo.sampleRate; 07471 pInit->channels = streaminfo.channels; 07472 pInit->bitsPerSample = streaminfo.bitsPerSample; 07473 pInit->totalPCMFrameCount = streaminfo.totalPCMFrameCount; 07474 pInit->maxBlockSizeInPCMFrames = streaminfo.maxBlockSizeInPCMFrames; 07475 pInit->hasMetadataBlocks = !isLastBlock; 07476 07477 if (onMeta) { 07478 drflac_metadata metadata; 07479 metadata.type = DRFLAC_METADATA_BLOCK_TYPE_STREAMINFO; 07480 metadata.pRawData = NULL; 07481 metadata.rawDataSize = 0; 07482 metadata.data.streaminfo = streaminfo; 07483 onMeta(pUserDataMD, &metadata); 07484 } 07485 07486 pInit->runningFilePos += pageBodySize; 07487 pInit->oggFirstBytePos = pInit->runningFilePos - 79; /* Subtracting 79 will place us right on top of the "OggS" identifier of the FLAC bos page. */ 07488 pInit->oggSerial = header.serialNumber; 07489 pInit->oggBosHeader = header; 07490 break; 07491 } else { 07492 /* Failed to read STREAMINFO block. Aww, so close... */ 07493 return DRFLAC_FALSE; 07494 } 07495 } else { 07496 /* Invalid file. */ 07497 return DRFLAC_FALSE; 07498 } 07499 } else { 07500 /* Not a FLAC header. Skip it. */ 07501 if (!onSeek(pUserData, bytesRemainingInPage, drflac_seek_origin_current)) { 07502 return DRFLAC_FALSE; 07503 } 07504 } 07505 } else { 07506 /* Not a FLAC header. Seek past the entire page and move on to the next. */ 07507 if (!onSeek(pUserData, bytesRemainingInPage, drflac_seek_origin_current)) { 07508 return DRFLAC_FALSE; 07509 } 07510 } 07511 } else { 07512 if (!onSeek(pUserData, pageBodySize, drflac_seek_origin_current)) { 07513 return DRFLAC_FALSE; 07514 } 07515 } 07516 07517 pInit->runningFilePos += pageBodySize; 07518 07519 07520 /* Read the header of the next page. */ 07521 if (drflac_ogg__read_page_header(onRead, pUserData, &header, &bytesRead, &crc32) != DRFLAC_SUCCESS) { 07522 return DRFLAC_FALSE; 07523 } 07524 pInit->runningFilePos += bytesRead; 07525 } 07526 07527 /* 07528 If we get here it means we found a FLAC audio stream. We should be sitting on the first byte of the header of the next page. The next 07529 packets in the FLAC logical stream contain the metadata. The only thing left to do in the initialization phase for Ogg is to create the 07530 Ogg bistream object. 07531 */ 07532 pInit->hasMetadataBlocks = DRFLAC_TRUE; /* <-- Always have at least VORBIS_COMMENT metadata block. */ 07533 return DRFLAC_TRUE; 07534 } 07535 #endif 07536 07537 static drflac_bool32 drflac__init_private(drflac_init_info* pInit, drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, drflac_container container, void* pUserData, void* pUserDataMD) 07538 { 07539 drflac_bool32 relaxed; 07540 drflac_uint8 id[4]; 07541 07542 if (pInit == NULL || onRead == NULL || onSeek == NULL) { 07543 return DRFLAC_FALSE; 07544 } 07545 07546 DRFLAC_ZERO_MEMORY(pInit, sizeof(*pInit)); 07547 pInit->onRead = onRead; 07548 pInit->onSeek = onSeek; 07549 pInit->onMeta = onMeta; 07550 pInit->container = container; 07551 pInit->pUserData = pUserData; 07552 pInit->pUserDataMD = pUserDataMD; 07553 07554 pInit->bs.onRead = onRead; 07555 pInit->bs.onSeek = onSeek; 07556 pInit->bs.pUserData = pUserData; 07557 drflac__reset_cache(&pInit->bs); 07558 07559 07560 /* If the container is explicitly defined then we can try opening in relaxed mode. */ 07561 relaxed = container != drflac_container_unknown; 07562 07563 /* Skip over any ID3 tags. */ 07564 for (;;) { 07565 if (onRead(pUserData, id, 4) != 4) { 07566 return DRFLAC_FALSE; /* Ran out of data. */ 07567 } 07568 pInit->runningFilePos += 4; 07569 07570 if (id[0] == 'I' && id[1] == 'D' && id[2] == '3') { 07571 drflac_uint8 header[6]; 07572 drflac_uint8 flags; 07573 drflac_uint32 headerSize; 07574 07575 if (onRead(pUserData, header, 6) != 6) { 07576 return DRFLAC_FALSE; /* Ran out of data. */ 07577 } 07578 pInit->runningFilePos += 6; 07579 07580 flags = header[1]; 07581 07582 DRFLAC_COPY_MEMORY(&headerSize, header+2, 4); 07583 headerSize = drflac__unsynchsafe_32(drflac__be2host_32(headerSize)); 07584 if (flags & 0x10) { 07585 headerSize += 10; 07586 } 07587 07588 if (!onSeek(pUserData, headerSize, drflac_seek_origin_current)) { 07589 return DRFLAC_FALSE; /* Failed to seek past the tag. */ 07590 } 07591 pInit->runningFilePos += headerSize; 07592 } else { 07593 break; 07594 } 07595 } 07596 07597 if (id[0] == 'f' && id[1] == 'L' && id[2] == 'a' && id[3] == 'C') { 07598 return drflac__init_private__native(pInit, onRead, onSeek, onMeta, pUserData, pUserDataMD, relaxed); 07599 } 07600 #ifndef DR_FLAC_NO_OGG 07601 if (id[0] == 'O' && id[1] == 'g' && id[2] == 'g' && id[3] == 'S') { 07602 return drflac__init_private__ogg(pInit, onRead, onSeek, onMeta, pUserData, pUserDataMD, relaxed); 07603 } 07604 #endif 07605 07606 /* If we get here it means we likely don't have a header. Try opening in relaxed mode, if applicable. */ 07607 if (relaxed) { 07608 if (container == drflac_container_native) { 07609 return drflac__init_private__native(pInit, onRead, onSeek, onMeta, pUserData, pUserDataMD, relaxed); 07610 } 07611 #ifndef DR_FLAC_NO_OGG 07612 if (container == drflac_container_ogg) { 07613 return drflac__init_private__ogg(pInit, onRead, onSeek, onMeta, pUserData, pUserDataMD, relaxed); 07614 } 07615 #endif 07616 } 07617 07618 /* Unsupported container. */ 07619 return DRFLAC_FALSE; 07620 } 07621 07622 static void drflac__init_from_info(drflac* pFlac, const drflac_init_info* pInit) 07623 { 07624 DRFLAC_ASSERT(pFlac != NULL); 07625 DRFLAC_ASSERT(pInit != NULL); 07626 07627 DRFLAC_ZERO_MEMORY(pFlac, sizeof(*pFlac)); 07628 pFlac->bs = pInit->bs; 07629 pFlac->onMeta = pInit->onMeta; 07630 pFlac->pUserDataMD = pInit->pUserDataMD; 07631 pFlac->maxBlockSizeInPCMFrames = pInit->maxBlockSizeInPCMFrames; 07632 pFlac->sampleRate = pInit->sampleRate; 07633 pFlac->channels = (drflac_uint8)pInit->channels; 07634 pFlac->bitsPerSample = (drflac_uint8)pInit->bitsPerSample; 07635 pFlac->totalPCMFrameCount = pInit->totalPCMFrameCount; 07636 pFlac->container = pInit->container; 07637 } 07638 07639 07640 static drflac* drflac_open_with_metadata_private(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, drflac_container container, void* pUserData, void* pUserDataMD, const drflac_allocation_callbacks* pAllocationCallbacks) 07641 { 07642 drflac_init_info init; 07643 drflac_uint32 allocationSize; 07644 drflac_uint32 wholeSIMDVectorCountPerChannel; 07645 drflac_uint32 decodedSamplesAllocationSize; 07646 #ifndef DR_FLAC_NO_OGG 07647 drflac_oggbs oggbs; 07648 #endif 07649 drflac_uint64 firstFramePos; 07650 drflac_uint64 seektablePos; 07651 drflac_uint32 seektableSize; 07652 drflac_allocation_callbacks allocationCallbacks; 07653 drflac* pFlac; 07654 07655 /* CPU support first. */ 07656 drflac__init_cpu_caps(); 07657 07658 if (!drflac__init_private(&init, onRead, onSeek, onMeta, container, pUserData, pUserDataMD)) { 07659 return NULL; 07660 } 07661 07662 if (pAllocationCallbacks != NULL) { 07663 allocationCallbacks = *pAllocationCallbacks; 07664 if (allocationCallbacks.onFree == NULL || (allocationCallbacks.onMalloc == NULL && allocationCallbacks.onRealloc == NULL)) { 07665 return NULL; /* Invalid allocation callbacks. */ 07666 } 07667 } else { 07668 allocationCallbacks.pUserData = NULL; 07669 allocationCallbacks.onMalloc = drflac__malloc_default; 07670 allocationCallbacks.onRealloc = drflac__realloc_default; 07671 allocationCallbacks.onFree = drflac__free_default; 07672 } 07673 07674 07675 /* 07676 The size of the allocation for the drflac object needs to be large enough to fit the following: 07677 1) The main members of the drflac structure 07678 2) A block of memory large enough to store the decoded samples of the largest frame in the stream 07679 3) If the container is Ogg, a drflac_oggbs object 07680 07681 The complicated part of the allocation is making sure there's enough room the decoded samples, taking into consideration 07682 the different SIMD instruction sets. 07683 */ 07684 allocationSize = sizeof(drflac); 07685 07686 /* 07687 The allocation size for decoded frames depends on the number of 32-bit integers that fit inside the largest SIMD vector 07688 we are supporting. 07689 */ 07690 if ((init.maxBlockSizeInPCMFrames % (DRFLAC_MAX_SIMD_VECTOR_SIZE / sizeof(drflac_int32))) == 0) { 07691 wholeSIMDVectorCountPerChannel = (init.maxBlockSizeInPCMFrames / (DRFLAC_MAX_SIMD_VECTOR_SIZE / sizeof(drflac_int32))); 07692 } else { 07693 wholeSIMDVectorCountPerChannel = (init.maxBlockSizeInPCMFrames / (DRFLAC_MAX_SIMD_VECTOR_SIZE / sizeof(drflac_int32))) + 1; 07694 } 07695 07696 decodedSamplesAllocationSize = wholeSIMDVectorCountPerChannel * DRFLAC_MAX_SIMD_VECTOR_SIZE * init.channels; 07697 07698 allocationSize += decodedSamplesAllocationSize; 07699 allocationSize += DRFLAC_MAX_SIMD_VECTOR_SIZE; /* Allocate extra bytes to ensure we have enough for alignment. */ 07700 07701 #ifndef DR_FLAC_NO_OGG 07702 /* There's additional data required for Ogg streams. */ 07703 if (init.container == drflac_container_ogg) { 07704 allocationSize += sizeof(drflac_oggbs); 07705 } 07706 07707 DRFLAC_ZERO_MEMORY(&oggbs, sizeof(oggbs)); 07708 if (init.container == drflac_container_ogg) { 07709 oggbs.onRead = onRead; 07710 oggbs.onSeek = onSeek; 07711 oggbs.pUserData = pUserData; 07712 oggbs.currentBytePos = init.oggFirstBytePos; 07713 oggbs.firstBytePos = init.oggFirstBytePos; 07714 oggbs.serialNumber = init.oggSerial; 07715 oggbs.bosPageHeader = init.oggBosHeader; 07716 oggbs.bytesRemainingInPage = 0; 07717 } 07718 #endif 07719 07720 /* 07721 This part is a bit awkward. We need to load the seektable so that it can be referenced in-memory, but I want the drflac object to 07722 consist of only a single heap allocation. To this, the size of the seek table needs to be known, which we determine when reading 07723 and decoding the metadata. 07724 */ 07725 firstFramePos = 42; /* <-- We know we are at byte 42 at this point. */ 07726 seektablePos = 0; 07727 seektableSize = 0; 07728 if (init.hasMetadataBlocks) { 07729 drflac_read_proc onReadOverride = onRead; 07730 drflac_seek_proc onSeekOverride = onSeek; 07731 void* pUserDataOverride = pUserData; 07732 07733 #ifndef DR_FLAC_NO_OGG 07734 if (init.container == drflac_container_ogg) { 07735 onReadOverride = drflac__on_read_ogg; 07736 onSeekOverride = drflac__on_seek_ogg; 07737 pUserDataOverride = (void*)&oggbs; 07738 } 07739 #endif 07740 07741 if (!drflac__read_and_decode_metadata(onReadOverride, onSeekOverride, onMeta, pUserDataOverride, pUserDataMD, &firstFramePos, &seektablePos, &seektableSize, &allocationCallbacks)) { 07742 return NULL; 07743 } 07744 07745 allocationSize += seektableSize; 07746 } 07747 07748 07749 pFlac = (drflac*)drflac__malloc_from_callbacks(allocationSize, &allocationCallbacks); 07750 if (pFlac == NULL) { 07751 return NULL; 07752 } 07753 07754 drflac__init_from_info(pFlac, &init); 07755 pFlac->allocationCallbacks = allocationCallbacks; 07756 pFlac->pDecodedSamples = (drflac_int32*)drflac_align((size_t)pFlac->pExtraData, DRFLAC_MAX_SIMD_VECTOR_SIZE); 07757 07758 #ifndef DR_FLAC_NO_OGG 07759 if (init.container == drflac_container_ogg) { 07760 drflac_oggbs* pInternalOggbs = (drflac_oggbs*)((drflac_uint8*)pFlac->pDecodedSamples + decodedSamplesAllocationSize + seektableSize); 07761 *pInternalOggbs = oggbs; 07762 07763 /* The Ogg bistream needs to be layered on top of the original bitstream. */ 07764 pFlac->bs.onRead = drflac__on_read_ogg; 07765 pFlac->bs.onSeek = drflac__on_seek_ogg; 07766 pFlac->bs.pUserData = (void*)pInternalOggbs; 07767 pFlac->_oggbs = (void*)pInternalOggbs; 07768 } 07769 #endif 07770 07771 pFlac->firstFLACFramePosInBytes = firstFramePos; 07772 07773 /* NOTE: Seektables are not currently compatible with Ogg encapsulation (Ogg has its own accelerated seeking system). I may change this later, so I'm leaving this here for now. */ 07774 #ifndef DR_FLAC_NO_OGG 07775 if (init.container == drflac_container_ogg) 07776 { 07777 pFlac->pSeekpoints = NULL; 07778 pFlac->seekpointCount = 0; 07779 } 07780 else 07781 #endif 07782 { 07783 /* If we have a seektable we need to load it now, making sure we move back to where we were previously. */ 07784 if (seektablePos != 0) { 07785 pFlac->seekpointCount = seektableSize / sizeof(*pFlac->pSeekpoints); 07786 pFlac->pSeekpoints = (drflac_seekpoint*)((drflac_uint8*)pFlac->pDecodedSamples + decodedSamplesAllocationSize); 07787 07788 DRFLAC_ASSERT(pFlac->bs.onSeek != NULL); 07789 DRFLAC_ASSERT(pFlac->bs.onRead != NULL); 07790 07791 /* Seek to the seektable, then just read directly into our seektable buffer. */ 07792 if (pFlac->bs.onSeek(pFlac->bs.pUserData, (int)seektablePos, drflac_seek_origin_start)) { 07793 if (pFlac->bs.onRead(pFlac->bs.pUserData, pFlac->pSeekpoints, seektableSize) == seektableSize) { 07794 /* Endian swap. */ 07795 drflac_uint32 iSeekpoint; 07796 for (iSeekpoint = 0; iSeekpoint < pFlac->seekpointCount; ++iSeekpoint) { 07797 pFlac->pSeekpoints[iSeekpoint].firstPCMFrame = drflac__be2host_64(pFlac->pSeekpoints[iSeekpoint].firstPCMFrame); 07798 pFlac->pSeekpoints[iSeekpoint].flacFrameOffset = drflac__be2host_64(pFlac->pSeekpoints[iSeekpoint].flacFrameOffset); 07799 pFlac->pSeekpoints[iSeekpoint].pcmFrameCount = drflac__be2host_16(pFlac->pSeekpoints[iSeekpoint].pcmFrameCount); 07800 } 07801 } else { 07802 /* Failed to read the seektable. Pretend we don't have one. */ 07803 pFlac->pSeekpoints = NULL; 07804 pFlac->seekpointCount = 0; 07805 } 07806 07807 /* We need to seek back to where we were. If this fails it's a critical error. */ 07808 if (!pFlac->bs.onSeek(pFlac->bs.pUserData, (int)pFlac->firstFLACFramePosInBytes, drflac_seek_origin_start)) { 07809 drflac__free_from_callbacks(pFlac, &allocationCallbacks); 07810 return NULL; 07811 } 07812 } else { 07813 /* Failed to seek to the seektable. Ominous sign, but for now we can just pretend we don't have one. */ 07814 pFlac->pSeekpoints = NULL; 07815 pFlac->seekpointCount = 0; 07816 } 07817 } 07818 } 07819 07820 07821 /* 07822 If we get here, but don't have a STREAMINFO block, it means we've opened the stream in relaxed mode and need to decode 07823 the first frame. 07824 */ 07825 if (!init.hasStreamInfoBlock) { 07826 pFlac->currentFLACFrame.header = init.firstFrameHeader; 07827 for (;;) { 07828 drflac_result result = drflac__decode_flac_frame(pFlac); 07829 if (result == DRFLAC_SUCCESS) { 07830 break; 07831 } else { 07832 if (result == DRFLAC_CRC_MISMATCH) { 07833 if (!drflac__read_next_flac_frame_header(&pFlac->bs, pFlac->bitsPerSample, &pFlac->currentFLACFrame.header)) { 07834 drflac__free_from_callbacks(pFlac, &allocationCallbacks); 07835 return NULL; 07836 } 07837 continue; 07838 } else { 07839 drflac__free_from_callbacks(pFlac, &allocationCallbacks); 07840 return NULL; 07841 } 07842 } 07843 } 07844 } 07845 07846 return pFlac; 07847 } 07848 07849 07850 07851 #ifndef DR_FLAC_NO_STDIO 07852 #include <stdio.h> 07853 #include <wchar.h> /* For wcslen(), wcsrtombs() */ 07854 07855 /* drflac_result_from_errno() is only used for fopen() and wfopen() so putting it inside DR_WAV_NO_STDIO for now. If something else needs this later we can move it out. */ 07856 #include <errno.h> 07857 static drflac_result drflac_result_from_errno(int e) 07858 { 07859 switch (e) 07860 { 07861 case 0: return DRFLAC_SUCCESS; 07862 #ifdef EPERM 07863 case EPERM: return DRFLAC_INVALID_OPERATION; 07864 #endif 07865 #ifdef ENOENT 07866 case ENOENT: return DRFLAC_DOES_NOT_EXIST; 07867 #endif 07868 #ifdef ESRCH 07869 case ESRCH: return DRFLAC_DOES_NOT_EXIST; 07870 #endif 07871 #ifdef EINTR 07872 case EINTR: return DRFLAC_INTERRUPT; 07873 #endif 07874 #ifdef EIO 07875 case EIO: return DRFLAC_IO_ERROR; 07876 #endif 07877 #ifdef ENXIO 07878 case ENXIO: return DRFLAC_DOES_NOT_EXIST; 07879 #endif 07880 #ifdef E2BIG 07881 case E2BIG: return DRFLAC_INVALID_ARGS; 07882 #endif 07883 #ifdef ENOEXEC 07884 case ENOEXEC: return DRFLAC_INVALID_FILE; 07885 #endif 07886 #ifdef EBADF 07887 case EBADF: return DRFLAC_INVALID_FILE; 07888 #endif 07889 #ifdef ECHILD 07890 case ECHILD: return DRFLAC_ERROR; 07891 #endif 07892 #ifdef EAGAIN 07893 case EAGAIN: return DRFLAC_UNAVAILABLE; 07894 #endif 07895 #ifdef ENOMEM 07896 case ENOMEM: return DRFLAC_OUT_OF_MEMORY; 07897 #endif 07898 #ifdef EACCES 07899 case EACCES: return DRFLAC_ACCESS_DENIED; 07900 #endif 07901 #ifdef EFAULT 07902 case EFAULT: return DRFLAC_BAD_ADDRESS; 07903 #endif 07904 #ifdef ENOTBLK 07905 case ENOTBLK: return DRFLAC_ERROR; 07906 #endif 07907 #ifdef EBUSY 07908 case EBUSY: return DRFLAC_BUSY; 07909 #endif 07910 #ifdef EEXIST 07911 case EEXIST: return DRFLAC_ALREADY_EXISTS; 07912 #endif 07913 #ifdef EXDEV 07914 case EXDEV: return DRFLAC_ERROR; 07915 #endif 07916 #ifdef ENODEV 07917 case ENODEV: return DRFLAC_DOES_NOT_EXIST; 07918 #endif 07919 #ifdef ENOTDIR 07920 case ENOTDIR: return DRFLAC_NOT_DIRECTORY; 07921 #endif 07922 #ifdef EISDIR 07923 case EISDIR: return DRFLAC_IS_DIRECTORY; 07924 #endif 07925 #ifdef EINVAL 07926 case EINVAL: return DRFLAC_INVALID_ARGS; 07927 #endif 07928 #ifdef ENFILE 07929 case ENFILE: return DRFLAC_TOO_MANY_OPEN_FILES; 07930 #endif 07931 #ifdef EMFILE 07932 case EMFILE: return DRFLAC_TOO_MANY_OPEN_FILES; 07933 #endif 07934 #ifdef ENOTTY 07935 case ENOTTY: return DRFLAC_INVALID_OPERATION; 07936 #endif 07937 #ifdef ETXTBSY 07938 case ETXTBSY: return DRFLAC_BUSY; 07939 #endif 07940 #ifdef EFBIG 07941 case EFBIG: return DRFLAC_TOO_BIG; 07942 #endif 07943 #ifdef ENOSPC 07944 case ENOSPC: return DRFLAC_NO_SPACE; 07945 #endif 07946 #ifdef ESPIPE 07947 case ESPIPE: return DRFLAC_BAD_SEEK; 07948 #endif 07949 #ifdef EROFS 07950 case EROFS: return DRFLAC_ACCESS_DENIED; 07951 #endif 07952 #ifdef EMLINK 07953 case EMLINK: return DRFLAC_TOO_MANY_LINKS; 07954 #endif 07955 #ifdef EPIPE 07956 case EPIPE: return DRFLAC_BAD_PIPE; 07957 #endif 07958 #ifdef EDOM 07959 case EDOM: return DRFLAC_OUT_OF_RANGE; 07960 #endif 07961 #ifdef ERANGE 07962 case ERANGE: return DRFLAC_OUT_OF_RANGE; 07963 #endif 07964 #ifdef EDEADLK 07965 case EDEADLK: return DRFLAC_DEADLOCK; 07966 #endif 07967 #ifdef ENAMETOOLONG 07968 case ENAMETOOLONG: return DRFLAC_PATH_TOO_LONG; 07969 #endif 07970 #ifdef ENOLCK 07971 case ENOLCK: return DRFLAC_ERROR; 07972 #endif 07973 #ifdef ENOSYS 07974 case ENOSYS: return DRFLAC_NOT_IMPLEMENTED; 07975 #endif 07976 #ifdef ENOTEMPTY 07977 case ENOTEMPTY: return DRFLAC_DIRECTORY_NOT_EMPTY; 07978 #endif 07979 #ifdef ELOOP 07980 case ELOOP: return DRFLAC_TOO_MANY_LINKS; 07981 #endif 07982 #ifdef ENOMSG 07983 case ENOMSG: return DRFLAC_NO_MESSAGE; 07984 #endif 07985 #ifdef EIDRM 07986 case EIDRM: return DRFLAC_ERROR; 07987 #endif 07988 #ifdef ECHRNG 07989 case ECHRNG: return DRFLAC_ERROR; 07990 #endif 07991 #ifdef EL2NSYNC 07992 case EL2NSYNC: return DRFLAC_ERROR; 07993 #endif 07994 #ifdef EL3HLT 07995 case EL3HLT: return DRFLAC_ERROR; 07996 #endif 07997 #ifdef EL3RST 07998 case EL3RST: return DRFLAC_ERROR; 07999 #endif 08000 #ifdef ELNRNG 08001 case ELNRNG: return DRFLAC_OUT_OF_RANGE; 08002 #endif 08003 #ifdef EUNATCH 08004 case EUNATCH: return DRFLAC_ERROR; 08005 #endif 08006 #ifdef ENOCSI 08007 case ENOCSI: return DRFLAC_ERROR; 08008 #endif 08009 #ifdef EL2HLT 08010 case EL2HLT: return DRFLAC_ERROR; 08011 #endif 08012 #ifdef EBADE 08013 case EBADE: return DRFLAC_ERROR; 08014 #endif 08015 #ifdef EBADR 08016 case EBADR: return DRFLAC_ERROR; 08017 #endif 08018 #ifdef EXFULL 08019 case EXFULL: return DRFLAC_ERROR; 08020 #endif 08021 #ifdef ENOANO 08022 case ENOANO: return DRFLAC_ERROR; 08023 #endif 08024 #ifdef EBADRQC 08025 case EBADRQC: return DRFLAC_ERROR; 08026 #endif 08027 #ifdef EBADSLT 08028 case EBADSLT: return DRFLAC_ERROR; 08029 #endif 08030 #ifdef EBFONT 08031 case EBFONT: return DRFLAC_INVALID_FILE; 08032 #endif 08033 #ifdef ENOSTR 08034 case ENOSTR: return DRFLAC_ERROR; 08035 #endif 08036 #ifdef ENODATA 08037 case ENODATA: return DRFLAC_NO_DATA_AVAILABLE; 08038 #endif 08039 #ifdef ETIME 08040 case ETIME: return DRFLAC_TIMEOUT; 08041 #endif 08042 #ifdef ENOSR 08043 case ENOSR: return DRFLAC_NO_DATA_AVAILABLE; 08044 #endif 08045 #ifdef ENONET 08046 case ENONET: return DRFLAC_NO_NETWORK; 08047 #endif 08048 #ifdef ENOPKG 08049 case ENOPKG: return DRFLAC_ERROR; 08050 #endif 08051 #ifdef EREMOTE 08052 case EREMOTE: return DRFLAC_ERROR; 08053 #endif 08054 #ifdef ENOLINK 08055 case ENOLINK: return DRFLAC_ERROR; 08056 #endif 08057 #ifdef EADV 08058 case EADV: return DRFLAC_ERROR; 08059 #endif 08060 #ifdef ESRMNT 08061 case ESRMNT: return DRFLAC_ERROR; 08062 #endif 08063 #ifdef ECOMM 08064 case ECOMM: return DRFLAC_ERROR; 08065 #endif 08066 #ifdef EPROTO 08067 case EPROTO: return DRFLAC_ERROR; 08068 #endif 08069 #ifdef EMULTIHOP 08070 case EMULTIHOP: return DRFLAC_ERROR; 08071 #endif 08072 #ifdef EDOTDOT 08073 case EDOTDOT: return DRFLAC_ERROR; 08074 #endif 08075 #ifdef EBADMSG 08076 case EBADMSG: return DRFLAC_BAD_MESSAGE; 08077 #endif 08078 #ifdef EOVERFLOW 08079 case EOVERFLOW: return DRFLAC_TOO_BIG; 08080 #endif 08081 #ifdef ENOTUNIQ 08082 case ENOTUNIQ: return DRFLAC_NOT_UNIQUE; 08083 #endif 08084 #ifdef EBADFD 08085 case EBADFD: return DRFLAC_ERROR; 08086 #endif 08087 #ifdef EREMCHG 08088 case EREMCHG: return DRFLAC_ERROR; 08089 #endif 08090 #ifdef ELIBACC 08091 case ELIBACC: return DRFLAC_ACCESS_DENIED; 08092 #endif 08093 #ifdef ELIBBAD 08094 case ELIBBAD: return DRFLAC_INVALID_FILE; 08095 #endif 08096 #ifdef ELIBSCN 08097 case ELIBSCN: return DRFLAC_INVALID_FILE; 08098 #endif 08099 #ifdef ELIBMAX 08100 case ELIBMAX: return DRFLAC_ERROR; 08101 #endif 08102 #ifdef ELIBEXEC 08103 case ELIBEXEC: return DRFLAC_ERROR; 08104 #endif 08105 #ifdef EILSEQ 08106 case EILSEQ: return DRFLAC_INVALID_DATA; 08107 #endif 08108 #ifdef ERESTART 08109 case ERESTART: return DRFLAC_ERROR; 08110 #endif 08111 #ifdef ESTRPIPE 08112 case ESTRPIPE: return DRFLAC_ERROR; 08113 #endif 08114 #ifdef EUSERS 08115 case EUSERS: return DRFLAC_ERROR; 08116 #endif 08117 #ifdef ENOTSOCK 08118 case ENOTSOCK: return DRFLAC_NOT_SOCKET; 08119 #endif 08120 #ifdef EDESTADDRREQ 08121 case EDESTADDRREQ: return DRFLAC_NO_ADDRESS; 08122 #endif 08123 #ifdef EMSGSIZE 08124 case EMSGSIZE: return DRFLAC_TOO_BIG; 08125 #endif 08126 #ifdef EPROTOTYPE 08127 case EPROTOTYPE: return DRFLAC_BAD_PROTOCOL; 08128 #endif 08129 #ifdef ENOPROTOOPT 08130 case ENOPROTOOPT: return DRFLAC_PROTOCOL_UNAVAILABLE; 08131 #endif 08132 #ifdef EPROTONOSUPPORT 08133 case EPROTONOSUPPORT: return DRFLAC_PROTOCOL_NOT_SUPPORTED; 08134 #endif 08135 #ifdef ESOCKTNOSUPPORT 08136 case ESOCKTNOSUPPORT: return DRFLAC_SOCKET_NOT_SUPPORTED; 08137 #endif 08138 #ifdef EOPNOTSUPP 08139 case EOPNOTSUPP: return DRFLAC_INVALID_OPERATION; 08140 #endif 08141 #ifdef EPFNOSUPPORT 08142 case EPFNOSUPPORT: return DRFLAC_PROTOCOL_FAMILY_NOT_SUPPORTED; 08143 #endif 08144 #ifdef EAFNOSUPPORT 08145 case EAFNOSUPPORT: return DRFLAC_ADDRESS_FAMILY_NOT_SUPPORTED; 08146 #endif 08147 #ifdef EADDRINUSE 08148 case EADDRINUSE: return DRFLAC_ALREADY_IN_USE; 08149 #endif 08150 #ifdef EADDRNOTAVAIL 08151 case EADDRNOTAVAIL: return DRFLAC_ERROR; 08152 #endif 08153 #ifdef ENETDOWN 08154 case ENETDOWN: return DRFLAC_NO_NETWORK; 08155 #endif 08156 #ifdef ENETUNREACH 08157 case ENETUNREACH: return DRFLAC_NO_NETWORK; 08158 #endif 08159 #ifdef ENETRESET 08160 case ENETRESET: return DRFLAC_NO_NETWORK; 08161 #endif 08162 #ifdef ECONNABORTED 08163 case ECONNABORTED: return DRFLAC_NO_NETWORK; 08164 #endif 08165 #ifdef ECONNRESET 08166 case ECONNRESET: return DRFLAC_CONNECTION_RESET; 08167 #endif 08168 #ifdef ENOBUFS 08169 case ENOBUFS: return DRFLAC_NO_SPACE; 08170 #endif 08171 #ifdef EISCONN 08172 case EISCONN: return DRFLAC_ALREADY_CONNECTED; 08173 #endif 08174 #ifdef ENOTCONN 08175 case ENOTCONN: return DRFLAC_NOT_CONNECTED; 08176 #endif 08177 #ifdef ESHUTDOWN 08178 case ESHUTDOWN: return DRFLAC_ERROR; 08179 #endif 08180 #ifdef ETOOMANYREFS 08181 case ETOOMANYREFS: return DRFLAC_ERROR; 08182 #endif 08183 #ifdef ETIMEDOUT 08184 case ETIMEDOUT: return DRFLAC_TIMEOUT; 08185 #endif 08186 #ifdef ECONNREFUSED 08187 case ECONNREFUSED: return DRFLAC_CONNECTION_REFUSED; 08188 #endif 08189 #ifdef EHOSTDOWN 08190 case EHOSTDOWN: return DRFLAC_NO_HOST; 08191 #endif 08192 #ifdef EHOSTUNREACH 08193 case EHOSTUNREACH: return DRFLAC_NO_HOST; 08194 #endif 08195 #ifdef EALREADY 08196 case EALREADY: return DRFLAC_IN_PROGRESS; 08197 #endif 08198 #ifdef EINPROGRESS 08199 case EINPROGRESS: return DRFLAC_IN_PROGRESS; 08200 #endif 08201 #ifdef ESTALE 08202 case ESTALE: return DRFLAC_INVALID_FILE; 08203 #endif 08204 #ifdef EUCLEAN 08205 case EUCLEAN: return DRFLAC_ERROR; 08206 #endif 08207 #ifdef ENOTNAM 08208 case ENOTNAM: return DRFLAC_ERROR; 08209 #endif 08210 #ifdef ENAVAIL 08211 case ENAVAIL: return DRFLAC_ERROR; 08212 #endif 08213 #ifdef EISNAM 08214 case EISNAM: return DRFLAC_ERROR; 08215 #endif 08216 #ifdef EREMOTEIO 08217 case EREMOTEIO: return DRFLAC_IO_ERROR; 08218 #endif 08219 #ifdef EDQUOT 08220 case EDQUOT: return DRFLAC_NO_SPACE; 08221 #endif 08222 #ifdef ENOMEDIUM 08223 case ENOMEDIUM: return DRFLAC_DOES_NOT_EXIST; 08224 #endif 08225 #ifdef EMEDIUMTYPE 08226 case EMEDIUMTYPE: return DRFLAC_ERROR; 08227 #endif 08228 #ifdef ECANCELED 08229 case ECANCELED: return DRFLAC_CANCELLED; 08230 #endif 08231 #ifdef ENOKEY 08232 case ENOKEY: return DRFLAC_ERROR; 08233 #endif 08234 #ifdef EKEYEXPIRED 08235 case EKEYEXPIRED: return DRFLAC_ERROR; 08236 #endif 08237 #ifdef EKEYREVOKED 08238 case EKEYREVOKED: return DRFLAC_ERROR; 08239 #endif 08240 #ifdef EKEYREJECTED 08241 case EKEYREJECTED: return DRFLAC_ERROR; 08242 #endif 08243 #ifdef EOWNERDEAD 08244 case EOWNERDEAD: return DRFLAC_ERROR; 08245 #endif 08246 #ifdef ENOTRECOVERABLE 08247 case ENOTRECOVERABLE: return DRFLAC_ERROR; 08248 #endif 08249 #ifdef ERFKILL 08250 case ERFKILL: return DRFLAC_ERROR; 08251 #endif 08252 #ifdef EHWPOISON 08253 case EHWPOISON: return DRFLAC_ERROR; 08254 #endif 08255 default: return DRFLAC_ERROR; 08256 } 08257 } 08258 08259 static drflac_result drflac_fopen(FILE** ppFile, const char* pFilePath, const char* pOpenMode) 08260 { 08261 #if _MSC_VER && _MSC_VER >= 1400 08262 errno_t err; 08263 #endif 08264 08265 if (ppFile != NULL) { 08266 *ppFile = NULL; /* Safety. */ 08267 } 08268 08269 if (pFilePath == NULL || pOpenMode == NULL || ppFile == NULL) { 08270 return DRFLAC_INVALID_ARGS; 08271 } 08272 08273 #if _MSC_VER && _MSC_VER >= 1400 08274 err = fopen_s(ppFile, pFilePath, pOpenMode); 08275 if (err != 0) { 08276 return drflac_result_from_errno(err); 08277 } 08278 #else 08279 #if defined(_WIN32) || defined(__APPLE__) 08280 *ppFile = fopen(pFilePath, pOpenMode); 08281 #else 08282 #if defined(_FILE_OFFSET_BITS) && _FILE_OFFSET_BITS == 64 && defined(_LARGEFILE64_SOURCE) 08283 *ppFile = fopen64(pFilePath, pOpenMode); 08284 #else 08285 *ppFile = fopen(pFilePath, pOpenMode); 08286 #endif 08287 #endif 08288 if (*ppFile == NULL) { 08289 drflac_result result = drflac_result_from_errno(errno); 08290 if (result == DRFLAC_SUCCESS) { 08291 result = DRFLAC_ERROR; /* Just a safety check to make sure we never ever return success when pFile == NULL. */ 08292 } 08293 08294 return result; 08295 } 08296 #endif 08297 08298 return DRFLAC_SUCCESS; 08299 } 08300 08301 /* 08302 _wfopen() isn't always available in all compilation environments. 08303 08304 * Windows only. 08305 * MSVC seems to support it universally as far back as VC6 from what I can tell (haven't checked further back). 08306 * MinGW-64 (both 32- and 64-bit) seems to support it. 08307 * MinGW wraps it in !defined(__STRICT_ANSI__). 08308 08309 This can be reviewed as compatibility issues arise. The preference is to use _wfopen_s() and _wfopen() as opposed to the wcsrtombs() 08310 fallback, so if you notice your compiler not detecting this properly I'm happy to look at adding support. 08311 */ 08312 #if defined(_WIN32) 08313 #if defined(_MSC_VER) || defined(__MINGW64__) || !defined(__STRICT_ANSI__) 08314 #define DRFLAC_HAS_WFOPEN 08315 #endif 08316 #endif 08317 08318 static drflac_result drflac_wfopen(FILE** ppFile, const wchar_t* pFilePath, const wchar_t* pOpenMode, const drflac_allocation_callbacks* pAllocationCallbacks) 08319 { 08320 if (ppFile != NULL) { 08321 *ppFile = NULL; /* Safety. */ 08322 } 08323 08324 if (pFilePath == NULL || pOpenMode == NULL || ppFile == NULL) { 08325 return DRFLAC_INVALID_ARGS; 08326 } 08327 08328 #if defined(DRFLAC_HAS_WFOPEN) 08329 { 08330 /* Use _wfopen() on Windows. */ 08331 #if defined(_MSC_VER) && _MSC_VER >= 1400 08332 errno_t err = _wfopen_s(ppFile, pFilePath, pOpenMode); 08333 if (err != 0) { 08334 return drflac_result_from_errno(err); 08335 } 08336 #else 08337 *ppFile = _wfopen(pFilePath, pOpenMode); 08338 if (*ppFile == NULL) { 08339 return drflac_result_from_errno(errno); 08340 } 08341 #endif 08342 (void)pAllocationCallbacks; 08343 } 08344 #else 08345 /* 08346 Use fopen() on anything other than Windows. Requires a conversion. This is annoying because fopen() is locale specific. The only real way I can 08347 think of to do this is with wcsrtombs(). Note that wcstombs() is apparently not thread-safe because it uses a static global mbstate_t object for 08348 maintaining state. I've checked this with -std=c89 and it works, but if somebody get's a compiler error I'll look into improving compatibility. 08349 */ 08350 { 08351 mbstate_t mbs; 08352 size_t lenMB; 08353 const wchar_t* pFilePathTemp = pFilePath; 08354 char* pFilePathMB = NULL; 08355 char pOpenModeMB[32] = {0}; 08356 08357 /* Get the length first. */ 08358 DRFLAC_ZERO_OBJECT(&mbs); 08359 lenMB = wcsrtombs(NULL, &pFilePathTemp, 0, &mbs); 08360 if (lenMB == (size_t)-1) { 08361 return drflac_result_from_errno(errno); 08362 } 08363 08364 pFilePathMB = (char*)drflac__malloc_from_callbacks(lenMB + 1, pAllocationCallbacks); 08365 if (pFilePathMB == NULL) { 08366 return DRFLAC_OUT_OF_MEMORY; 08367 } 08368 08369 pFilePathTemp = pFilePath; 08370 DRFLAC_ZERO_OBJECT(&mbs); 08371 wcsrtombs(pFilePathMB, &pFilePathTemp, lenMB + 1, &mbs); 08372 08373 /* The open mode should always consist of ASCII characters so we should be able to do a trivial conversion. */ 08374 { 08375 size_t i = 0; 08376 for (;;) { 08377 if (pOpenMode[i] == 0) { 08378 pOpenModeMB[i] = '\0'; 08379 break; 08380 } 08381 08382 pOpenModeMB[i] = (char)pOpenMode[i]; 08383 i += 1; 08384 } 08385 } 08386 08387 *ppFile = fopen(pFilePathMB, pOpenModeMB); 08388 08389 drflac__free_from_callbacks(pFilePathMB, pAllocationCallbacks); 08390 } 08391 08392 if (*ppFile == NULL) { 08393 return DRFLAC_ERROR; 08394 } 08395 #endif 08396 08397 return DRFLAC_SUCCESS; 08398 } 08399 08400 static size_t drflac__on_read_stdio(void* pUserData, void* bufferOut, size_t bytesToRead) 08401 { 08402 return fread(bufferOut, 1, bytesToRead, (FILE*)pUserData); 08403 } 08404 08405 static drflac_bool32 drflac__on_seek_stdio(void* pUserData, int offset, drflac_seek_origin origin) 08406 { 08407 DRFLAC_ASSERT(offset >= 0); /* <-- Never seek backwards. */ 08408 08409 return fseek((FILE*)pUserData, offset, (origin == drflac_seek_origin_current) ? SEEK_CUR : SEEK_SET) == 0; 08410 } 08411 08412 08413 DRFLAC_API drflac* drflac_open_file(const char* pFileName, const drflac_allocation_callbacks* pAllocationCallbacks) 08414 { 08415 drflac* pFlac; 08416 FILE* pFile; 08417 08418 if (drflac_fopen(&pFile, pFileName, "rb") != DRFLAC_SUCCESS) { 08419 return NULL; 08420 } 08421 08422 pFlac = drflac_open(drflac__on_read_stdio, drflac__on_seek_stdio, (void*)pFile, pAllocationCallbacks); 08423 if (pFlac == NULL) { 08424 fclose(pFile); 08425 return NULL; 08426 } 08427 08428 return pFlac; 08429 } 08430 08431 DRFLAC_API drflac* drflac_open_file_w(const wchar_t* pFileName, const drflac_allocation_callbacks* pAllocationCallbacks) 08432 { 08433 drflac* pFlac; 08434 FILE* pFile; 08435 08436 if (drflac_wfopen(&pFile, pFileName, L"rb", pAllocationCallbacks) != DRFLAC_SUCCESS) { 08437 return NULL; 08438 } 08439 08440 pFlac = drflac_open(drflac__on_read_stdio, drflac__on_seek_stdio, (void*)pFile, pAllocationCallbacks); 08441 if (pFlac == NULL) { 08442 fclose(pFile); 08443 return NULL; 08444 } 08445 08446 return pFlac; 08447 } 08448 08449 DRFLAC_API drflac* drflac_open_file_with_metadata(const char* pFileName, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks) 08450 { 08451 drflac* pFlac; 08452 FILE* pFile; 08453 08454 if (drflac_fopen(&pFile, pFileName, "rb") != DRFLAC_SUCCESS) { 08455 return NULL; 08456 } 08457 08458 pFlac = drflac_open_with_metadata_private(drflac__on_read_stdio, drflac__on_seek_stdio, onMeta, drflac_container_unknown, (void*)pFile, pUserData, pAllocationCallbacks); 08459 if (pFlac == NULL) { 08460 fclose(pFile); 08461 return pFlac; 08462 } 08463 08464 return pFlac; 08465 } 08466 08467 DRFLAC_API drflac* drflac_open_file_with_metadata_w(const wchar_t* pFileName, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks) 08468 { 08469 drflac* pFlac; 08470 FILE* pFile; 08471 08472 if (drflac_wfopen(&pFile, pFileName, L"rb", pAllocationCallbacks) != DRFLAC_SUCCESS) { 08473 return NULL; 08474 } 08475 08476 pFlac = drflac_open_with_metadata_private(drflac__on_read_stdio, drflac__on_seek_stdio, onMeta, drflac_container_unknown, (void*)pFile, pUserData, pAllocationCallbacks); 08477 if (pFlac == NULL) { 08478 fclose(pFile); 08479 return pFlac; 08480 } 08481 08482 return pFlac; 08483 } 08484 #endif /* DR_FLAC_NO_STDIO */ 08485 08486 static size_t drflac__on_read_memory(void* pUserData, void* bufferOut, size_t bytesToRead) 08487 { 08488 drflac__memory_stream* memoryStream = (drflac__memory_stream*)pUserData; 08489 size_t bytesRemaining; 08490 08491 DRFLAC_ASSERT(memoryStream != NULL); 08492 DRFLAC_ASSERT(memoryStream->dataSize >= memoryStream->currentReadPos); 08493 08494 bytesRemaining = memoryStream->dataSize - memoryStream->currentReadPos; 08495 if (bytesToRead > bytesRemaining) { 08496 bytesToRead = bytesRemaining; 08497 } 08498 08499 if (bytesToRead > 0) { 08500 DRFLAC_COPY_MEMORY(bufferOut, memoryStream->data + memoryStream->currentReadPos, bytesToRead); 08501 memoryStream->currentReadPos += bytesToRead; 08502 } 08503 08504 return bytesToRead; 08505 } 08506 08507 static drflac_bool32 drflac__on_seek_memory(void* pUserData, int offset, drflac_seek_origin origin) 08508 { 08509 drflac__memory_stream* memoryStream = (drflac__memory_stream*)pUserData; 08510 08511 DRFLAC_ASSERT(memoryStream != NULL); 08512 DRFLAC_ASSERT(offset >= 0); /* <-- Never seek backwards. */ 08513 08514 if (offset > (drflac_int64)memoryStream->dataSize) { 08515 return DRFLAC_FALSE; 08516 } 08517 08518 if (origin == drflac_seek_origin_current) { 08519 if (memoryStream->currentReadPos + offset <= memoryStream->dataSize) { 08520 memoryStream->currentReadPos += offset; 08521 } else { 08522 return DRFLAC_FALSE; /* Trying to seek too far forward. */ 08523 } 08524 } else { 08525 if ((drflac_uint32)offset <= memoryStream->dataSize) { 08526 memoryStream->currentReadPos = offset; 08527 } else { 08528 return DRFLAC_FALSE; /* Trying to seek too far forward. */ 08529 } 08530 } 08531 08532 return DRFLAC_TRUE; 08533 } 08534 08535 DRFLAC_API drflac* drflac_open_memory(const void* pData, size_t dataSize, const drflac_allocation_callbacks* pAllocationCallbacks) 08536 { 08537 drflac__memory_stream memoryStream; 08538 drflac* pFlac; 08539 08540 memoryStream.data = (const drflac_uint8*)pData; 08541 memoryStream.dataSize = dataSize; 08542 memoryStream.currentReadPos = 0; 08543 pFlac = drflac_open(drflac__on_read_memory, drflac__on_seek_memory, &memoryStream, pAllocationCallbacks); 08544 if (pFlac == NULL) { 08545 return NULL; 08546 } 08547 08548 pFlac->memoryStream = memoryStream; 08549 08550 /* This is an awful hack... */ 08551 #ifndef DR_FLAC_NO_OGG 08552 if (pFlac->container == drflac_container_ogg) 08553 { 08554 drflac_oggbs* oggbs = (drflac_oggbs*)pFlac->_oggbs; 08555 oggbs->pUserData = &pFlac->memoryStream; 08556 } 08557 else 08558 #endif 08559 { 08560 pFlac->bs.pUserData = &pFlac->memoryStream; 08561 } 08562 08563 return pFlac; 08564 } 08565 08566 DRFLAC_API drflac* drflac_open_memory_with_metadata(const void* pData, size_t dataSize, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks) 08567 { 08568 drflac__memory_stream memoryStream; 08569 drflac* pFlac; 08570 08571 memoryStream.data = (const drflac_uint8*)pData; 08572 memoryStream.dataSize = dataSize; 08573 memoryStream.currentReadPos = 0; 08574 pFlac = drflac_open_with_metadata_private(drflac__on_read_memory, drflac__on_seek_memory, onMeta, drflac_container_unknown, &memoryStream, pUserData, pAllocationCallbacks); 08575 if (pFlac == NULL) { 08576 return NULL; 08577 } 08578 08579 pFlac->memoryStream = memoryStream; 08580 08581 /* This is an awful hack... */ 08582 #ifndef DR_FLAC_NO_OGG 08583 if (pFlac->container == drflac_container_ogg) 08584 { 08585 drflac_oggbs* oggbs = (drflac_oggbs*)pFlac->_oggbs; 08586 oggbs->pUserData = &pFlac->memoryStream; 08587 } 08588 else 08589 #endif 08590 { 08591 pFlac->bs.pUserData = &pFlac->memoryStream; 08592 } 08593 08594 return pFlac; 08595 } 08596 08597 08598 08599 DRFLAC_API drflac* drflac_open(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks) 08600 { 08601 return drflac_open_with_metadata_private(onRead, onSeek, NULL, drflac_container_unknown, pUserData, pUserData, pAllocationCallbacks); 08602 } 08603 DRFLAC_API drflac* drflac_open_relaxed(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_container container, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks) 08604 { 08605 return drflac_open_with_metadata_private(onRead, onSeek, NULL, container, pUserData, pUserData, pAllocationCallbacks); 08606 } 08607 08608 DRFLAC_API drflac* drflac_open_with_metadata(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks) 08609 { 08610 return drflac_open_with_metadata_private(onRead, onSeek, onMeta, drflac_container_unknown, pUserData, pUserData, pAllocationCallbacks); 08611 } 08612 DRFLAC_API drflac* drflac_open_with_metadata_relaxed(drflac_read_proc onRead, drflac_seek_proc onSeek, drflac_meta_proc onMeta, drflac_container container, void* pUserData, const drflac_allocation_callbacks* pAllocationCallbacks) 08613 { 08614 return drflac_open_with_metadata_private(onRead, onSeek, onMeta, container, pUserData, pUserData, pAllocationCallbacks); 08615 } 08616 08617 DRFLAC_API void drflac_close(drflac* pFlac) 08618 { 08619 if (pFlac == NULL) { 08620 return; 08621 } 08622 08623 #ifndef DR_FLAC_NO_STDIO 08624 /* 08625 If we opened the file with drflac_open_file() we will want to close the file handle. We can know whether or not drflac_open_file() 08626 was used by looking at the callbacks. 08627 */ 08628 if (pFlac->bs.onRead == drflac__on_read_stdio) { 08629 fclose((FILE*)pFlac->bs.pUserData); 08630 } 08631 08632 #ifndef DR_FLAC_NO_OGG 08633 /* Need to clean up Ogg streams a bit differently due to the way the bit streaming is chained. */ 08634 if (pFlac->container == drflac_container_ogg) { 08635 drflac_oggbs* oggbs = (drflac_oggbs*)pFlac->_oggbs; 08636 DRFLAC_ASSERT(pFlac->bs.onRead == drflac__on_read_ogg); 08637 08638 if (oggbs->onRead == drflac__on_read_stdio) { 08639 fclose((FILE*)oggbs->pUserData); 08640 } 08641 } 08642 #endif 08643 #endif 08644 08645 drflac__free_from_callbacks(pFlac, &pFlac->allocationCallbacks); 08646 } 08647 08648 08649 #if 0 08650 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 08651 { 08652 drflac_uint64 i; 08653 for (i = 0; i < frameCount; ++i) { 08654 drflac_uint32 left = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 08655 drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 08656 drflac_uint32 right = left - side; 08657 08658 pOutputSamples[i*2+0] = (drflac_int32)left; 08659 pOutputSamples[i*2+1] = (drflac_int32)right; 08660 } 08661 } 08662 #endif 08663 08664 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 08665 { 08666 drflac_uint64 i; 08667 drflac_uint64 frameCount4 = frameCount >> 2; 08668 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 08669 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 08670 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 08671 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 08672 08673 for (i = 0; i < frameCount4; ++i) { 08674 drflac_uint32 left0 = pInputSamples0U32[i*4+0] << shift0; 08675 drflac_uint32 left1 = pInputSamples0U32[i*4+1] << shift0; 08676 drflac_uint32 left2 = pInputSamples0U32[i*4+2] << shift0; 08677 drflac_uint32 left3 = pInputSamples0U32[i*4+3] << shift0; 08678 08679 drflac_uint32 side0 = pInputSamples1U32[i*4+0] << shift1; 08680 drflac_uint32 side1 = pInputSamples1U32[i*4+1] << shift1; 08681 drflac_uint32 side2 = pInputSamples1U32[i*4+2] << shift1; 08682 drflac_uint32 side3 = pInputSamples1U32[i*4+3] << shift1; 08683 08684 drflac_uint32 right0 = left0 - side0; 08685 drflac_uint32 right1 = left1 - side1; 08686 drflac_uint32 right2 = left2 - side2; 08687 drflac_uint32 right3 = left3 - side3; 08688 08689 pOutputSamples[i*8+0] = (drflac_int32)left0; 08690 pOutputSamples[i*8+1] = (drflac_int32)right0; 08691 pOutputSamples[i*8+2] = (drflac_int32)left1; 08692 pOutputSamples[i*8+3] = (drflac_int32)right1; 08693 pOutputSamples[i*8+4] = (drflac_int32)left2; 08694 pOutputSamples[i*8+5] = (drflac_int32)right2; 08695 pOutputSamples[i*8+6] = (drflac_int32)left3; 08696 pOutputSamples[i*8+7] = (drflac_int32)right3; 08697 } 08698 08699 for (i = (frameCount4 << 2); i < frameCount; ++i) { 08700 drflac_uint32 left = pInputSamples0U32[i] << shift0; 08701 drflac_uint32 side = pInputSamples1U32[i] << shift1; 08702 drflac_uint32 right = left - side; 08703 08704 pOutputSamples[i*2+0] = (drflac_int32)left; 08705 pOutputSamples[i*2+1] = (drflac_int32)right; 08706 } 08707 } 08708 08709 #if defined(DRFLAC_SUPPORT_SSE2) 08710 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 08711 { 08712 drflac_uint64 i; 08713 drflac_uint64 frameCount4 = frameCount >> 2; 08714 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 08715 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 08716 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 08717 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 08718 08719 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 08720 08721 for (i = 0; i < frameCount4; ++i) { 08722 __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0); 08723 __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1); 08724 __m128i right = _mm_sub_epi32(left, side); 08725 08726 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left, right)); 08727 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4), _mm_unpackhi_epi32(left, right)); 08728 } 08729 08730 for (i = (frameCount4 << 2); i < frameCount; ++i) { 08731 drflac_uint32 left = pInputSamples0U32[i] << shift0; 08732 drflac_uint32 side = pInputSamples1U32[i] << shift1; 08733 drflac_uint32 right = left - side; 08734 08735 pOutputSamples[i*2+0] = (drflac_int32)left; 08736 pOutputSamples[i*2+1] = (drflac_int32)right; 08737 } 08738 } 08739 #endif 08740 08741 #if defined(DRFLAC_SUPPORT_NEON) 08742 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 08743 { 08744 drflac_uint64 i; 08745 drflac_uint64 frameCount4 = frameCount >> 2; 08746 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 08747 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 08748 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 08749 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 08750 int32x4_t shift0_4; 08751 int32x4_t shift1_4; 08752 08753 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 08754 08755 shift0_4 = vdupq_n_s32(shift0); 08756 shift1_4 = vdupq_n_s32(shift1); 08757 08758 for (i = 0; i < frameCount4; ++i) { 08759 uint32x4_t left; 08760 uint32x4_t side; 08761 uint32x4_t right; 08762 08763 left = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4); 08764 side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4); 08765 right = vsubq_u32(left, side); 08766 08767 drflac__vst2q_u32((drflac_uint32*)pOutputSamples + i*8, vzipq_u32(left, right)); 08768 } 08769 08770 for (i = (frameCount4 << 2); i < frameCount; ++i) { 08771 drflac_uint32 left = pInputSamples0U32[i] << shift0; 08772 drflac_uint32 side = pInputSamples1U32[i] << shift1; 08773 drflac_uint32 right = left - side; 08774 08775 pOutputSamples[i*2+0] = (drflac_int32)left; 08776 pOutputSamples[i*2+1] = (drflac_int32)right; 08777 } 08778 } 08779 #endif 08780 08781 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_left_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 08782 { 08783 #if defined(DRFLAC_SUPPORT_SSE2) 08784 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 08785 drflac_read_pcm_frames_s32__decode_left_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 08786 } else 08787 #elif defined(DRFLAC_SUPPORT_NEON) 08788 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 08789 drflac_read_pcm_frames_s32__decode_left_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 08790 } else 08791 #endif 08792 { 08793 /* Scalar fallback. */ 08794 #if 0 08795 drflac_read_pcm_frames_s32__decode_left_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 08796 #else 08797 drflac_read_pcm_frames_s32__decode_left_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 08798 #endif 08799 } 08800 } 08801 08802 08803 #if 0 08804 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 08805 { 08806 drflac_uint64 i; 08807 for (i = 0; i < frameCount; ++i) { 08808 drflac_uint32 side = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 08809 drflac_uint32 right = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 08810 drflac_uint32 left = right + side; 08811 08812 pOutputSamples[i*2+0] = (drflac_int32)left; 08813 pOutputSamples[i*2+1] = (drflac_int32)right; 08814 } 08815 } 08816 #endif 08817 08818 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 08819 { 08820 drflac_uint64 i; 08821 drflac_uint64 frameCount4 = frameCount >> 2; 08822 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 08823 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 08824 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 08825 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 08826 08827 for (i = 0; i < frameCount4; ++i) { 08828 drflac_uint32 side0 = pInputSamples0U32[i*4+0] << shift0; 08829 drflac_uint32 side1 = pInputSamples0U32[i*4+1] << shift0; 08830 drflac_uint32 side2 = pInputSamples0U32[i*4+2] << shift0; 08831 drflac_uint32 side3 = pInputSamples0U32[i*4+3] << shift0; 08832 08833 drflac_uint32 right0 = pInputSamples1U32[i*4+0] << shift1; 08834 drflac_uint32 right1 = pInputSamples1U32[i*4+1] << shift1; 08835 drflac_uint32 right2 = pInputSamples1U32[i*4+2] << shift1; 08836 drflac_uint32 right3 = pInputSamples1U32[i*4+3] << shift1; 08837 08838 drflac_uint32 left0 = right0 + side0; 08839 drflac_uint32 left1 = right1 + side1; 08840 drflac_uint32 left2 = right2 + side2; 08841 drflac_uint32 left3 = right3 + side3; 08842 08843 pOutputSamples[i*8+0] = (drflac_int32)left0; 08844 pOutputSamples[i*8+1] = (drflac_int32)right0; 08845 pOutputSamples[i*8+2] = (drflac_int32)left1; 08846 pOutputSamples[i*8+3] = (drflac_int32)right1; 08847 pOutputSamples[i*8+4] = (drflac_int32)left2; 08848 pOutputSamples[i*8+5] = (drflac_int32)right2; 08849 pOutputSamples[i*8+6] = (drflac_int32)left3; 08850 pOutputSamples[i*8+7] = (drflac_int32)right3; 08851 } 08852 08853 for (i = (frameCount4 << 2); i < frameCount; ++i) { 08854 drflac_uint32 side = pInputSamples0U32[i] << shift0; 08855 drflac_uint32 right = pInputSamples1U32[i] << shift1; 08856 drflac_uint32 left = right + side; 08857 08858 pOutputSamples[i*2+0] = (drflac_int32)left; 08859 pOutputSamples[i*2+1] = (drflac_int32)right; 08860 } 08861 } 08862 08863 #if defined(DRFLAC_SUPPORT_SSE2) 08864 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 08865 { 08866 drflac_uint64 i; 08867 drflac_uint64 frameCount4 = frameCount >> 2; 08868 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 08869 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 08870 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 08871 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 08872 08873 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 08874 08875 for (i = 0; i < frameCount4; ++i) { 08876 __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0); 08877 __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1); 08878 __m128i left = _mm_add_epi32(right, side); 08879 08880 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left, right)); 08881 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4), _mm_unpackhi_epi32(left, right)); 08882 } 08883 08884 for (i = (frameCount4 << 2); i < frameCount; ++i) { 08885 drflac_uint32 side = pInputSamples0U32[i] << shift0; 08886 drflac_uint32 right = pInputSamples1U32[i] << shift1; 08887 drflac_uint32 left = right + side; 08888 08889 pOutputSamples[i*2+0] = (drflac_int32)left; 08890 pOutputSamples[i*2+1] = (drflac_int32)right; 08891 } 08892 } 08893 #endif 08894 08895 #if defined(DRFLAC_SUPPORT_NEON) 08896 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 08897 { 08898 drflac_uint64 i; 08899 drflac_uint64 frameCount4 = frameCount >> 2; 08900 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 08901 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 08902 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 08903 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 08904 int32x4_t shift0_4; 08905 int32x4_t shift1_4; 08906 08907 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 08908 08909 shift0_4 = vdupq_n_s32(shift0); 08910 shift1_4 = vdupq_n_s32(shift1); 08911 08912 for (i = 0; i < frameCount4; ++i) { 08913 uint32x4_t side; 08914 uint32x4_t right; 08915 uint32x4_t left; 08916 08917 side = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4); 08918 right = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4); 08919 left = vaddq_u32(right, side); 08920 08921 drflac__vst2q_u32((drflac_uint32*)pOutputSamples + i*8, vzipq_u32(left, right)); 08922 } 08923 08924 for (i = (frameCount4 << 2); i < frameCount; ++i) { 08925 drflac_uint32 side = pInputSamples0U32[i] << shift0; 08926 drflac_uint32 right = pInputSamples1U32[i] << shift1; 08927 drflac_uint32 left = right + side; 08928 08929 pOutputSamples[i*2+0] = (drflac_int32)left; 08930 pOutputSamples[i*2+1] = (drflac_int32)right; 08931 } 08932 } 08933 #endif 08934 08935 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_right_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 08936 { 08937 #if defined(DRFLAC_SUPPORT_SSE2) 08938 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 08939 drflac_read_pcm_frames_s32__decode_right_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 08940 } else 08941 #elif defined(DRFLAC_SUPPORT_NEON) 08942 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 08943 drflac_read_pcm_frames_s32__decode_right_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 08944 } else 08945 #endif 08946 { 08947 /* Scalar fallback. */ 08948 #if 0 08949 drflac_read_pcm_frames_s32__decode_right_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 08950 #else 08951 drflac_read_pcm_frames_s32__decode_right_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 08952 #endif 08953 } 08954 } 08955 08956 08957 #if 0 08958 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 08959 { 08960 for (drflac_uint64 i = 0; i < frameCount; ++i) { 08961 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 08962 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 08963 08964 mid = (mid << 1) | (side & 0x01); 08965 08966 pOutputSamples[i*2+0] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample); 08967 pOutputSamples[i*2+1] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample); 08968 } 08969 } 08970 #endif 08971 08972 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 08973 { 08974 drflac_uint64 i; 08975 drflac_uint64 frameCount4 = frameCount >> 2; 08976 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 08977 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 08978 drflac_int32 shift = unusedBitsPerSample; 08979 08980 if (shift > 0) { 08981 shift -= 1; 08982 for (i = 0; i < frameCount4; ++i) { 08983 drflac_uint32 temp0L; 08984 drflac_uint32 temp1L; 08985 drflac_uint32 temp2L; 08986 drflac_uint32 temp3L; 08987 drflac_uint32 temp0R; 08988 drflac_uint32 temp1R; 08989 drflac_uint32 temp2R; 08990 drflac_uint32 temp3R; 08991 08992 drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 08993 drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 08994 drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 08995 drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 08996 08997 drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 08998 drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 08999 drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09000 drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09001 09002 mid0 = (mid0 << 1) | (side0 & 0x01); 09003 mid1 = (mid1 << 1) | (side1 & 0x01); 09004 mid2 = (mid2 << 1) | (side2 & 0x01); 09005 mid3 = (mid3 << 1) | (side3 & 0x01); 09006 09007 temp0L = (mid0 + side0) << shift; 09008 temp1L = (mid1 + side1) << shift; 09009 temp2L = (mid2 + side2) << shift; 09010 temp3L = (mid3 + side3) << shift; 09011 09012 temp0R = (mid0 - side0) << shift; 09013 temp1R = (mid1 - side1) << shift; 09014 temp2R = (mid2 - side2) << shift; 09015 temp3R = (mid3 - side3) << shift; 09016 09017 pOutputSamples[i*8+0] = (drflac_int32)temp0L; 09018 pOutputSamples[i*8+1] = (drflac_int32)temp0R; 09019 pOutputSamples[i*8+2] = (drflac_int32)temp1L; 09020 pOutputSamples[i*8+3] = (drflac_int32)temp1R; 09021 pOutputSamples[i*8+4] = (drflac_int32)temp2L; 09022 pOutputSamples[i*8+5] = (drflac_int32)temp2R; 09023 pOutputSamples[i*8+6] = (drflac_int32)temp3L; 09024 pOutputSamples[i*8+7] = (drflac_int32)temp3R; 09025 } 09026 } else { 09027 for (i = 0; i < frameCount4; ++i) { 09028 drflac_uint32 temp0L; 09029 drflac_uint32 temp1L; 09030 drflac_uint32 temp2L; 09031 drflac_uint32 temp3L; 09032 drflac_uint32 temp0R; 09033 drflac_uint32 temp1R; 09034 drflac_uint32 temp2R; 09035 drflac_uint32 temp3R; 09036 09037 drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09038 drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09039 drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09040 drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09041 09042 drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09043 drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09044 drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09045 drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09046 09047 mid0 = (mid0 << 1) | (side0 & 0x01); 09048 mid1 = (mid1 << 1) | (side1 & 0x01); 09049 mid2 = (mid2 << 1) | (side2 & 0x01); 09050 mid3 = (mid3 << 1) | (side3 & 0x01); 09051 09052 temp0L = (drflac_uint32)((drflac_int32)(mid0 + side0) >> 1); 09053 temp1L = (drflac_uint32)((drflac_int32)(mid1 + side1) >> 1); 09054 temp2L = (drflac_uint32)((drflac_int32)(mid2 + side2) >> 1); 09055 temp3L = (drflac_uint32)((drflac_int32)(mid3 + side3) >> 1); 09056 09057 temp0R = (drflac_uint32)((drflac_int32)(mid0 - side0) >> 1); 09058 temp1R = (drflac_uint32)((drflac_int32)(mid1 - side1) >> 1); 09059 temp2R = (drflac_uint32)((drflac_int32)(mid2 - side2) >> 1); 09060 temp3R = (drflac_uint32)((drflac_int32)(mid3 - side3) >> 1); 09061 09062 pOutputSamples[i*8+0] = (drflac_int32)temp0L; 09063 pOutputSamples[i*8+1] = (drflac_int32)temp0R; 09064 pOutputSamples[i*8+2] = (drflac_int32)temp1L; 09065 pOutputSamples[i*8+3] = (drflac_int32)temp1R; 09066 pOutputSamples[i*8+4] = (drflac_int32)temp2L; 09067 pOutputSamples[i*8+5] = (drflac_int32)temp2R; 09068 pOutputSamples[i*8+6] = (drflac_int32)temp3L; 09069 pOutputSamples[i*8+7] = (drflac_int32)temp3R; 09070 } 09071 } 09072 09073 for (i = (frameCount4 << 2); i < frameCount; ++i) { 09074 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09075 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09076 09077 mid = (mid << 1) | (side & 0x01); 09078 09079 pOutputSamples[i*2+0] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample); 09080 pOutputSamples[i*2+1] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample); 09081 } 09082 } 09083 09084 #if defined(DRFLAC_SUPPORT_SSE2) 09085 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 09086 { 09087 drflac_uint64 i; 09088 drflac_uint64 frameCount4 = frameCount >> 2; 09089 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 09090 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 09091 drflac_int32 shift = unusedBitsPerSample; 09092 09093 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 09094 09095 if (shift == 0) { 09096 for (i = 0; i < frameCount4; ++i) { 09097 __m128i mid; 09098 __m128i side; 09099 __m128i left; 09100 __m128i right; 09101 09102 mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 09103 side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 09104 09105 mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01))); 09106 09107 left = _mm_srai_epi32(_mm_add_epi32(mid, side), 1); 09108 right = _mm_srai_epi32(_mm_sub_epi32(mid, side), 1); 09109 09110 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left, right)); 09111 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4), _mm_unpackhi_epi32(left, right)); 09112 } 09113 09114 for (i = (frameCount4 << 2); i < frameCount; ++i) { 09115 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09116 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09117 09118 mid = (mid << 1) | (side & 0x01); 09119 09120 pOutputSamples[i*2+0] = (drflac_int32)(mid + side) >> 1; 09121 pOutputSamples[i*2+1] = (drflac_int32)(mid - side) >> 1; 09122 } 09123 } else { 09124 shift -= 1; 09125 for (i = 0; i < frameCount4; ++i) { 09126 __m128i mid; 09127 __m128i side; 09128 __m128i left; 09129 __m128i right; 09130 09131 mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 09132 side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 09133 09134 mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01))); 09135 09136 left = _mm_slli_epi32(_mm_add_epi32(mid, side), shift); 09137 right = _mm_slli_epi32(_mm_sub_epi32(mid, side), shift); 09138 09139 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left, right)); 09140 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4), _mm_unpackhi_epi32(left, right)); 09141 } 09142 09143 for (i = (frameCount4 << 2); i < frameCount; ++i) { 09144 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09145 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09146 09147 mid = (mid << 1) | (side & 0x01); 09148 09149 pOutputSamples[i*2+0] = (drflac_int32)((mid + side) << shift); 09150 pOutputSamples[i*2+1] = (drflac_int32)((mid - side) << shift); 09151 } 09152 } 09153 } 09154 #endif 09155 09156 #if defined(DRFLAC_SUPPORT_NEON) 09157 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 09158 { 09159 drflac_uint64 i; 09160 drflac_uint64 frameCount4 = frameCount >> 2; 09161 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 09162 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 09163 drflac_int32 shift = unusedBitsPerSample; 09164 int32x4_t wbpsShift0_4; /* wbps = Wasted Bits Per Sample */ 09165 int32x4_t wbpsShift1_4; /* wbps = Wasted Bits Per Sample */ 09166 uint32x4_t one4; 09167 09168 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 09169 09170 wbpsShift0_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 09171 wbpsShift1_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 09172 one4 = vdupq_n_u32(1); 09173 09174 if (shift == 0) { 09175 for (i = 0; i < frameCount4; ++i) { 09176 uint32x4_t mid; 09177 uint32x4_t side; 09178 int32x4_t left; 09179 int32x4_t right; 09180 09181 mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbpsShift0_4); 09182 side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbpsShift1_4); 09183 09184 mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, one4)); 09185 09186 left = vshrq_n_s32(vreinterpretq_s32_u32(vaddq_u32(mid, side)), 1); 09187 right = vshrq_n_s32(vreinterpretq_s32_u32(vsubq_u32(mid, side)), 1); 09188 09189 drflac__vst2q_s32(pOutputSamples + i*8, vzipq_s32(left, right)); 09190 } 09191 09192 for (i = (frameCount4 << 2); i < frameCount; ++i) { 09193 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09194 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09195 09196 mid = (mid << 1) | (side & 0x01); 09197 09198 pOutputSamples[i*2+0] = (drflac_int32)(mid + side) >> 1; 09199 pOutputSamples[i*2+1] = (drflac_int32)(mid - side) >> 1; 09200 } 09201 } else { 09202 int32x4_t shift4; 09203 09204 shift -= 1; 09205 shift4 = vdupq_n_s32(shift); 09206 09207 for (i = 0; i < frameCount4; ++i) { 09208 uint32x4_t mid; 09209 uint32x4_t side; 09210 int32x4_t left; 09211 int32x4_t right; 09212 09213 mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbpsShift0_4); 09214 side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbpsShift1_4); 09215 09216 mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, one4)); 09217 09218 left = vreinterpretq_s32_u32(vshlq_u32(vaddq_u32(mid, side), shift4)); 09219 right = vreinterpretq_s32_u32(vshlq_u32(vsubq_u32(mid, side), shift4)); 09220 09221 drflac__vst2q_s32(pOutputSamples + i*8, vzipq_s32(left, right)); 09222 } 09223 09224 for (i = (frameCount4 << 2); i < frameCount; ++i) { 09225 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09226 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09227 09228 mid = (mid << 1) | (side & 0x01); 09229 09230 pOutputSamples[i*2+0] = (drflac_int32)((mid + side) << shift); 09231 pOutputSamples[i*2+1] = (drflac_int32)((mid - side) << shift); 09232 } 09233 } 09234 } 09235 #endif 09236 09237 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_mid_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 09238 { 09239 #if defined(DRFLAC_SUPPORT_SSE2) 09240 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 09241 drflac_read_pcm_frames_s32__decode_mid_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 09242 } else 09243 #elif defined(DRFLAC_SUPPORT_NEON) 09244 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 09245 drflac_read_pcm_frames_s32__decode_mid_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 09246 } else 09247 #endif 09248 { 09249 /* Scalar fallback. */ 09250 #if 0 09251 drflac_read_pcm_frames_s32__decode_mid_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 09252 #else 09253 drflac_read_pcm_frames_s32__decode_mid_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 09254 #endif 09255 } 09256 } 09257 09258 09259 #if 0 09260 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 09261 { 09262 for (drflac_uint64 i = 0; i < frameCount; ++i) { 09263 pOutputSamples[i*2+0] = (drflac_int32)((drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample)); 09264 pOutputSamples[i*2+1] = (drflac_int32)((drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample)); 09265 } 09266 } 09267 #endif 09268 09269 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 09270 { 09271 drflac_uint64 i; 09272 drflac_uint64 frameCount4 = frameCount >> 2; 09273 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 09274 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 09275 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09276 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09277 09278 for (i = 0; i < frameCount4; ++i) { 09279 drflac_uint32 tempL0 = pInputSamples0U32[i*4+0] << shift0; 09280 drflac_uint32 tempL1 = pInputSamples0U32[i*4+1] << shift0; 09281 drflac_uint32 tempL2 = pInputSamples0U32[i*4+2] << shift0; 09282 drflac_uint32 tempL3 = pInputSamples0U32[i*4+3] << shift0; 09283 09284 drflac_uint32 tempR0 = pInputSamples1U32[i*4+0] << shift1; 09285 drflac_uint32 tempR1 = pInputSamples1U32[i*4+1] << shift1; 09286 drflac_uint32 tempR2 = pInputSamples1U32[i*4+2] << shift1; 09287 drflac_uint32 tempR3 = pInputSamples1U32[i*4+3] << shift1; 09288 09289 pOutputSamples[i*8+0] = (drflac_int32)tempL0; 09290 pOutputSamples[i*8+1] = (drflac_int32)tempR0; 09291 pOutputSamples[i*8+2] = (drflac_int32)tempL1; 09292 pOutputSamples[i*8+3] = (drflac_int32)tempR1; 09293 pOutputSamples[i*8+4] = (drflac_int32)tempL2; 09294 pOutputSamples[i*8+5] = (drflac_int32)tempR2; 09295 pOutputSamples[i*8+6] = (drflac_int32)tempL3; 09296 pOutputSamples[i*8+7] = (drflac_int32)tempR3; 09297 } 09298 09299 for (i = (frameCount4 << 2); i < frameCount; ++i) { 09300 pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0); 09301 pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1); 09302 } 09303 } 09304 09305 #if defined(DRFLAC_SUPPORT_SSE2) 09306 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 09307 { 09308 drflac_uint64 i; 09309 drflac_uint64 frameCount4 = frameCount >> 2; 09310 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 09311 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 09312 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09313 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09314 09315 for (i = 0; i < frameCount4; ++i) { 09316 __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0); 09317 __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1); 09318 09319 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 0), _mm_unpacklo_epi32(left, right)); 09320 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8 + 4), _mm_unpackhi_epi32(left, right)); 09321 } 09322 09323 for (i = (frameCount4 << 2); i < frameCount; ++i) { 09324 pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0); 09325 pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1); 09326 } 09327 } 09328 #endif 09329 09330 #if defined(DRFLAC_SUPPORT_NEON) 09331 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 09332 { 09333 drflac_uint64 i; 09334 drflac_uint64 frameCount4 = frameCount >> 2; 09335 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 09336 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 09337 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09338 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09339 09340 int32x4_t shift4_0 = vdupq_n_s32(shift0); 09341 int32x4_t shift4_1 = vdupq_n_s32(shift1); 09342 09343 for (i = 0; i < frameCount4; ++i) { 09344 int32x4_t left; 09345 int32x4_t right; 09346 09347 left = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift4_0)); 09348 right = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift4_1)); 09349 09350 drflac__vst2q_s32(pOutputSamples + i*8, vzipq_s32(left, right)); 09351 } 09352 09353 for (i = (frameCount4 << 2); i < frameCount; ++i) { 09354 pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0); 09355 pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1); 09356 } 09357 } 09358 #endif 09359 09360 static DRFLAC_INLINE void drflac_read_pcm_frames_s32__decode_independent_stereo(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int32* pOutputSamples) 09361 { 09362 #if defined(DRFLAC_SUPPORT_SSE2) 09363 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 09364 drflac_read_pcm_frames_s32__decode_independent_stereo__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 09365 } else 09366 #elif defined(DRFLAC_SUPPORT_NEON) 09367 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 09368 drflac_read_pcm_frames_s32__decode_independent_stereo__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 09369 } else 09370 #endif 09371 { 09372 /* Scalar fallback. */ 09373 #if 0 09374 drflac_read_pcm_frames_s32__decode_independent_stereo__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 09375 #else 09376 drflac_read_pcm_frames_s32__decode_independent_stereo__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 09377 #endif 09378 } 09379 } 09380 09381 09382 DRFLAC_API drflac_uint64 drflac_read_pcm_frames_s32(drflac* pFlac, drflac_uint64 framesToRead, drflac_int32* pBufferOut) 09383 { 09384 drflac_uint64 framesRead; 09385 drflac_uint32 unusedBitsPerSample; 09386 09387 if (pFlac == NULL || framesToRead == 0) { 09388 return 0; 09389 } 09390 09391 if (pBufferOut == NULL) { 09392 return drflac__seek_forward_by_pcm_frames(pFlac, framesToRead); 09393 } 09394 09395 DRFLAC_ASSERT(pFlac->bitsPerSample <= 32); 09396 unusedBitsPerSample = 32 - pFlac->bitsPerSample; 09397 09398 framesRead = 0; 09399 while (framesToRead > 0) { 09400 /* If we've run out of samples in this frame, go to the next. */ 09401 if (pFlac->currentFLACFrame.pcmFramesRemaining == 0) { 09402 if (!drflac__read_and_decode_next_flac_frame(pFlac)) { 09403 break; /* Couldn't read the next frame, so just break from the loop and return. */ 09404 } 09405 } else { 09406 unsigned int channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment); 09407 drflac_uint64 iFirstPCMFrame = pFlac->currentFLACFrame.header.blockSizeInPCMFrames - pFlac->currentFLACFrame.pcmFramesRemaining; 09408 drflac_uint64 frameCountThisIteration = framesToRead; 09409 09410 if (frameCountThisIteration > pFlac->currentFLACFrame.pcmFramesRemaining) { 09411 frameCountThisIteration = pFlac->currentFLACFrame.pcmFramesRemaining; 09412 } 09413 09414 if (channelCount == 2) { 09415 const drflac_int32* pDecodedSamples0 = pFlac->currentFLACFrame.subframes[0].pSamplesS32 + iFirstPCMFrame; 09416 const drflac_int32* pDecodedSamples1 = pFlac->currentFLACFrame.subframes[1].pSamplesS32 + iFirstPCMFrame; 09417 09418 switch (pFlac->currentFLACFrame.header.channelAssignment) 09419 { 09420 case DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE: 09421 { 09422 drflac_read_pcm_frames_s32__decode_left_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 09423 } break; 09424 09425 case DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE: 09426 { 09427 drflac_read_pcm_frames_s32__decode_right_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 09428 } break; 09429 09430 case DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE: 09431 { 09432 drflac_read_pcm_frames_s32__decode_mid_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 09433 } break; 09434 09435 case DRFLAC_CHANNEL_ASSIGNMENT_INDEPENDENT: 09436 default: 09437 { 09438 drflac_read_pcm_frames_s32__decode_independent_stereo(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 09439 } break; 09440 } 09441 } else { 09442 /* Generic interleaving. */ 09443 drflac_uint64 i; 09444 for (i = 0; i < frameCountThisIteration; ++i) { 09445 unsigned int j; 09446 for (j = 0; j < channelCount; ++j) { 09447 pBufferOut[(i*channelCount)+j] = (drflac_int32)((drflac_uint32)(pFlac->currentFLACFrame.subframes[j].pSamplesS32[iFirstPCMFrame + i]) << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[j].wastedBitsPerSample)); 09448 } 09449 } 09450 } 09451 09452 framesRead += frameCountThisIteration; 09453 pBufferOut += frameCountThisIteration * channelCount; 09454 framesToRead -= frameCountThisIteration; 09455 pFlac->currentPCMFrame += frameCountThisIteration; 09456 pFlac->currentFLACFrame.pcmFramesRemaining -= (drflac_uint32)frameCountThisIteration; 09457 } 09458 } 09459 09460 return framesRead; 09461 } 09462 09463 09464 #if 0 09465 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 09466 { 09467 drflac_uint64 i; 09468 for (i = 0; i < frameCount; ++i) { 09469 drflac_uint32 left = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 09470 drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 09471 drflac_uint32 right = left - side; 09472 09473 left >>= 16; 09474 right >>= 16; 09475 09476 pOutputSamples[i*2+0] = (drflac_int16)left; 09477 pOutputSamples[i*2+1] = (drflac_int16)right; 09478 } 09479 } 09480 #endif 09481 09482 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 09483 { 09484 drflac_uint64 i; 09485 drflac_uint64 frameCount4 = frameCount >> 2; 09486 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 09487 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 09488 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09489 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09490 09491 for (i = 0; i < frameCount4; ++i) { 09492 drflac_uint32 left0 = pInputSamples0U32[i*4+0] << shift0; 09493 drflac_uint32 left1 = pInputSamples0U32[i*4+1] << shift0; 09494 drflac_uint32 left2 = pInputSamples0U32[i*4+2] << shift0; 09495 drflac_uint32 left3 = pInputSamples0U32[i*4+3] << shift0; 09496 09497 drflac_uint32 side0 = pInputSamples1U32[i*4+0] << shift1; 09498 drflac_uint32 side1 = pInputSamples1U32[i*4+1] << shift1; 09499 drflac_uint32 side2 = pInputSamples1U32[i*4+2] << shift1; 09500 drflac_uint32 side3 = pInputSamples1U32[i*4+3] << shift1; 09501 09502 drflac_uint32 right0 = left0 - side0; 09503 drflac_uint32 right1 = left1 - side1; 09504 drflac_uint32 right2 = left2 - side2; 09505 drflac_uint32 right3 = left3 - side3; 09506 09507 left0 >>= 16; 09508 left1 >>= 16; 09509 left2 >>= 16; 09510 left3 >>= 16; 09511 09512 right0 >>= 16; 09513 right1 >>= 16; 09514 right2 >>= 16; 09515 right3 >>= 16; 09516 09517 pOutputSamples[i*8+0] = (drflac_int16)left0; 09518 pOutputSamples[i*8+1] = (drflac_int16)right0; 09519 pOutputSamples[i*8+2] = (drflac_int16)left1; 09520 pOutputSamples[i*8+3] = (drflac_int16)right1; 09521 pOutputSamples[i*8+4] = (drflac_int16)left2; 09522 pOutputSamples[i*8+5] = (drflac_int16)right2; 09523 pOutputSamples[i*8+6] = (drflac_int16)left3; 09524 pOutputSamples[i*8+7] = (drflac_int16)right3; 09525 } 09526 09527 for (i = (frameCount4 << 2); i < frameCount; ++i) { 09528 drflac_uint32 left = pInputSamples0U32[i] << shift0; 09529 drflac_uint32 side = pInputSamples1U32[i] << shift1; 09530 drflac_uint32 right = left - side; 09531 09532 left >>= 16; 09533 right >>= 16; 09534 09535 pOutputSamples[i*2+0] = (drflac_int16)left; 09536 pOutputSamples[i*2+1] = (drflac_int16)right; 09537 } 09538 } 09539 09540 #if defined(DRFLAC_SUPPORT_SSE2) 09541 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 09542 { 09543 drflac_uint64 i; 09544 drflac_uint64 frameCount4 = frameCount >> 2; 09545 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 09546 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 09547 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09548 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09549 09550 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 09551 09552 for (i = 0; i < frameCount4; ++i) { 09553 __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0); 09554 __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1); 09555 __m128i right = _mm_sub_epi32(left, side); 09556 09557 left = _mm_srai_epi32(left, 16); 09558 right = _mm_srai_epi32(right, 16); 09559 09560 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right)); 09561 } 09562 09563 for (i = (frameCount4 << 2); i < frameCount; ++i) { 09564 drflac_uint32 left = pInputSamples0U32[i] << shift0; 09565 drflac_uint32 side = pInputSamples1U32[i] << shift1; 09566 drflac_uint32 right = left - side; 09567 09568 left >>= 16; 09569 right >>= 16; 09570 09571 pOutputSamples[i*2+0] = (drflac_int16)left; 09572 pOutputSamples[i*2+1] = (drflac_int16)right; 09573 } 09574 } 09575 #endif 09576 09577 #if defined(DRFLAC_SUPPORT_NEON) 09578 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 09579 { 09580 drflac_uint64 i; 09581 drflac_uint64 frameCount4 = frameCount >> 2; 09582 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 09583 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 09584 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09585 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09586 int32x4_t shift0_4; 09587 int32x4_t shift1_4; 09588 09589 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 09590 09591 shift0_4 = vdupq_n_s32(shift0); 09592 shift1_4 = vdupq_n_s32(shift1); 09593 09594 for (i = 0; i < frameCount4; ++i) { 09595 uint32x4_t left; 09596 uint32x4_t side; 09597 uint32x4_t right; 09598 09599 left = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4); 09600 side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4); 09601 right = vsubq_u32(left, side); 09602 09603 left = vshrq_n_u32(left, 16); 09604 right = vshrq_n_u32(right, 16); 09605 09606 drflac__vst2q_u16((drflac_uint16*)pOutputSamples + i*8, vzip_u16(vmovn_u32(left), vmovn_u32(right))); 09607 } 09608 09609 for (i = (frameCount4 << 2); i < frameCount; ++i) { 09610 drflac_uint32 left = pInputSamples0U32[i] << shift0; 09611 drflac_uint32 side = pInputSamples1U32[i] << shift1; 09612 drflac_uint32 right = left - side; 09613 09614 left >>= 16; 09615 right >>= 16; 09616 09617 pOutputSamples[i*2+0] = (drflac_int16)left; 09618 pOutputSamples[i*2+1] = (drflac_int16)right; 09619 } 09620 } 09621 #endif 09622 09623 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_left_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 09624 { 09625 #if defined(DRFLAC_SUPPORT_SSE2) 09626 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 09627 drflac_read_pcm_frames_s16__decode_left_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 09628 } else 09629 #elif defined(DRFLAC_SUPPORT_NEON) 09630 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 09631 drflac_read_pcm_frames_s16__decode_left_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 09632 } else 09633 #endif 09634 { 09635 /* Scalar fallback. */ 09636 #if 0 09637 drflac_read_pcm_frames_s16__decode_left_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 09638 #else 09639 drflac_read_pcm_frames_s16__decode_left_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 09640 #endif 09641 } 09642 } 09643 09644 09645 #if 0 09646 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 09647 { 09648 drflac_uint64 i; 09649 for (i = 0; i < frameCount; ++i) { 09650 drflac_uint32 side = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 09651 drflac_uint32 right = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 09652 drflac_uint32 left = right + side; 09653 09654 left >>= 16; 09655 right >>= 16; 09656 09657 pOutputSamples[i*2+0] = (drflac_int16)left; 09658 pOutputSamples[i*2+1] = (drflac_int16)right; 09659 } 09660 } 09661 #endif 09662 09663 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 09664 { 09665 drflac_uint64 i; 09666 drflac_uint64 frameCount4 = frameCount >> 2; 09667 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 09668 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 09669 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09670 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09671 09672 for (i = 0; i < frameCount4; ++i) { 09673 drflac_uint32 side0 = pInputSamples0U32[i*4+0] << shift0; 09674 drflac_uint32 side1 = pInputSamples0U32[i*4+1] << shift0; 09675 drflac_uint32 side2 = pInputSamples0U32[i*4+2] << shift0; 09676 drflac_uint32 side3 = pInputSamples0U32[i*4+3] << shift0; 09677 09678 drflac_uint32 right0 = pInputSamples1U32[i*4+0] << shift1; 09679 drflac_uint32 right1 = pInputSamples1U32[i*4+1] << shift1; 09680 drflac_uint32 right2 = pInputSamples1U32[i*4+2] << shift1; 09681 drflac_uint32 right3 = pInputSamples1U32[i*4+3] << shift1; 09682 09683 drflac_uint32 left0 = right0 + side0; 09684 drflac_uint32 left1 = right1 + side1; 09685 drflac_uint32 left2 = right2 + side2; 09686 drflac_uint32 left3 = right3 + side3; 09687 09688 left0 >>= 16; 09689 left1 >>= 16; 09690 left2 >>= 16; 09691 left3 >>= 16; 09692 09693 right0 >>= 16; 09694 right1 >>= 16; 09695 right2 >>= 16; 09696 right3 >>= 16; 09697 09698 pOutputSamples[i*8+0] = (drflac_int16)left0; 09699 pOutputSamples[i*8+1] = (drflac_int16)right0; 09700 pOutputSamples[i*8+2] = (drflac_int16)left1; 09701 pOutputSamples[i*8+3] = (drflac_int16)right1; 09702 pOutputSamples[i*8+4] = (drflac_int16)left2; 09703 pOutputSamples[i*8+5] = (drflac_int16)right2; 09704 pOutputSamples[i*8+6] = (drflac_int16)left3; 09705 pOutputSamples[i*8+7] = (drflac_int16)right3; 09706 } 09707 09708 for (i = (frameCount4 << 2); i < frameCount; ++i) { 09709 drflac_uint32 side = pInputSamples0U32[i] << shift0; 09710 drflac_uint32 right = pInputSamples1U32[i] << shift1; 09711 drflac_uint32 left = right + side; 09712 09713 left >>= 16; 09714 right >>= 16; 09715 09716 pOutputSamples[i*2+0] = (drflac_int16)left; 09717 pOutputSamples[i*2+1] = (drflac_int16)right; 09718 } 09719 } 09720 09721 #if defined(DRFLAC_SUPPORT_SSE2) 09722 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 09723 { 09724 drflac_uint64 i; 09725 drflac_uint64 frameCount4 = frameCount >> 2; 09726 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 09727 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 09728 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09729 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09730 09731 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 09732 09733 for (i = 0; i < frameCount4; ++i) { 09734 __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0); 09735 __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1); 09736 __m128i left = _mm_add_epi32(right, side); 09737 09738 left = _mm_srai_epi32(left, 16); 09739 right = _mm_srai_epi32(right, 16); 09740 09741 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right)); 09742 } 09743 09744 for (i = (frameCount4 << 2); i < frameCount; ++i) { 09745 drflac_uint32 side = pInputSamples0U32[i] << shift0; 09746 drflac_uint32 right = pInputSamples1U32[i] << shift1; 09747 drflac_uint32 left = right + side; 09748 09749 left >>= 16; 09750 right >>= 16; 09751 09752 pOutputSamples[i*2+0] = (drflac_int16)left; 09753 pOutputSamples[i*2+1] = (drflac_int16)right; 09754 } 09755 } 09756 #endif 09757 09758 #if defined(DRFLAC_SUPPORT_NEON) 09759 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 09760 { 09761 drflac_uint64 i; 09762 drflac_uint64 frameCount4 = frameCount >> 2; 09763 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 09764 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 09765 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09766 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09767 int32x4_t shift0_4; 09768 int32x4_t shift1_4; 09769 09770 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 09771 09772 shift0_4 = vdupq_n_s32(shift0); 09773 shift1_4 = vdupq_n_s32(shift1); 09774 09775 for (i = 0; i < frameCount4; ++i) { 09776 uint32x4_t side; 09777 uint32x4_t right; 09778 uint32x4_t left; 09779 09780 side = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4); 09781 right = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4); 09782 left = vaddq_u32(right, side); 09783 09784 left = vshrq_n_u32(left, 16); 09785 right = vshrq_n_u32(right, 16); 09786 09787 drflac__vst2q_u16((drflac_uint16*)pOutputSamples + i*8, vzip_u16(vmovn_u32(left), vmovn_u32(right))); 09788 } 09789 09790 for (i = (frameCount4 << 2); i < frameCount; ++i) { 09791 drflac_uint32 side = pInputSamples0U32[i] << shift0; 09792 drflac_uint32 right = pInputSamples1U32[i] << shift1; 09793 drflac_uint32 left = right + side; 09794 09795 left >>= 16; 09796 right >>= 16; 09797 09798 pOutputSamples[i*2+0] = (drflac_int16)left; 09799 pOutputSamples[i*2+1] = (drflac_int16)right; 09800 } 09801 } 09802 #endif 09803 09804 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_right_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 09805 { 09806 #if defined(DRFLAC_SUPPORT_SSE2) 09807 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 09808 drflac_read_pcm_frames_s16__decode_right_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 09809 } else 09810 #elif defined(DRFLAC_SUPPORT_NEON) 09811 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 09812 drflac_read_pcm_frames_s16__decode_right_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 09813 } else 09814 #endif 09815 { 09816 /* Scalar fallback. */ 09817 #if 0 09818 drflac_read_pcm_frames_s16__decode_right_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 09819 #else 09820 drflac_read_pcm_frames_s16__decode_right_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 09821 #endif 09822 } 09823 } 09824 09825 09826 #if 0 09827 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 09828 { 09829 for (drflac_uint64 i = 0; i < frameCount; ++i) { 09830 drflac_uint32 mid = (drflac_uint32)pInputSamples0[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09831 drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09832 09833 mid = (mid << 1) | (side & 0x01); 09834 09835 pOutputSamples[i*2+0] = (drflac_int16)(((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample) >> 16); 09836 pOutputSamples[i*2+1] = (drflac_int16)(((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample) >> 16); 09837 } 09838 } 09839 #endif 09840 09841 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 09842 { 09843 drflac_uint64 i; 09844 drflac_uint64 frameCount4 = frameCount >> 2; 09845 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 09846 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 09847 drflac_uint32 shift = unusedBitsPerSample; 09848 09849 if (shift > 0) { 09850 shift -= 1; 09851 for (i = 0; i < frameCount4; ++i) { 09852 drflac_uint32 temp0L; 09853 drflac_uint32 temp1L; 09854 drflac_uint32 temp2L; 09855 drflac_uint32 temp3L; 09856 drflac_uint32 temp0R; 09857 drflac_uint32 temp1R; 09858 drflac_uint32 temp2R; 09859 drflac_uint32 temp3R; 09860 09861 drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09862 drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09863 drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09864 drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09865 09866 drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09867 drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09868 drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09869 drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09870 09871 mid0 = (mid0 << 1) | (side0 & 0x01); 09872 mid1 = (mid1 << 1) | (side1 & 0x01); 09873 mid2 = (mid2 << 1) | (side2 & 0x01); 09874 mid3 = (mid3 << 1) | (side3 & 0x01); 09875 09876 temp0L = (mid0 + side0) << shift; 09877 temp1L = (mid1 + side1) << shift; 09878 temp2L = (mid2 + side2) << shift; 09879 temp3L = (mid3 + side3) << shift; 09880 09881 temp0R = (mid0 - side0) << shift; 09882 temp1R = (mid1 - side1) << shift; 09883 temp2R = (mid2 - side2) << shift; 09884 temp3R = (mid3 - side3) << shift; 09885 09886 temp0L >>= 16; 09887 temp1L >>= 16; 09888 temp2L >>= 16; 09889 temp3L >>= 16; 09890 09891 temp0R >>= 16; 09892 temp1R >>= 16; 09893 temp2R >>= 16; 09894 temp3R >>= 16; 09895 09896 pOutputSamples[i*8+0] = (drflac_int16)temp0L; 09897 pOutputSamples[i*8+1] = (drflac_int16)temp0R; 09898 pOutputSamples[i*8+2] = (drflac_int16)temp1L; 09899 pOutputSamples[i*8+3] = (drflac_int16)temp1R; 09900 pOutputSamples[i*8+4] = (drflac_int16)temp2L; 09901 pOutputSamples[i*8+5] = (drflac_int16)temp2R; 09902 pOutputSamples[i*8+6] = (drflac_int16)temp3L; 09903 pOutputSamples[i*8+7] = (drflac_int16)temp3R; 09904 } 09905 } else { 09906 for (i = 0; i < frameCount4; ++i) { 09907 drflac_uint32 temp0L; 09908 drflac_uint32 temp1L; 09909 drflac_uint32 temp2L; 09910 drflac_uint32 temp3L; 09911 drflac_uint32 temp0R; 09912 drflac_uint32 temp1R; 09913 drflac_uint32 temp2R; 09914 drflac_uint32 temp3R; 09915 09916 drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09917 drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09918 drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09919 drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09920 09921 drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09922 drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09923 drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09924 drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09925 09926 mid0 = (mid0 << 1) | (side0 & 0x01); 09927 mid1 = (mid1 << 1) | (side1 & 0x01); 09928 mid2 = (mid2 << 1) | (side2 & 0x01); 09929 mid3 = (mid3 << 1) | (side3 & 0x01); 09930 09931 temp0L = ((drflac_int32)(mid0 + side0) >> 1); 09932 temp1L = ((drflac_int32)(mid1 + side1) >> 1); 09933 temp2L = ((drflac_int32)(mid2 + side2) >> 1); 09934 temp3L = ((drflac_int32)(mid3 + side3) >> 1); 09935 09936 temp0R = ((drflac_int32)(mid0 - side0) >> 1); 09937 temp1R = ((drflac_int32)(mid1 - side1) >> 1); 09938 temp2R = ((drflac_int32)(mid2 - side2) >> 1); 09939 temp3R = ((drflac_int32)(mid3 - side3) >> 1); 09940 09941 temp0L >>= 16; 09942 temp1L >>= 16; 09943 temp2L >>= 16; 09944 temp3L >>= 16; 09945 09946 temp0R >>= 16; 09947 temp1R >>= 16; 09948 temp2R >>= 16; 09949 temp3R >>= 16; 09950 09951 pOutputSamples[i*8+0] = (drflac_int16)temp0L; 09952 pOutputSamples[i*8+1] = (drflac_int16)temp0R; 09953 pOutputSamples[i*8+2] = (drflac_int16)temp1L; 09954 pOutputSamples[i*8+3] = (drflac_int16)temp1R; 09955 pOutputSamples[i*8+4] = (drflac_int16)temp2L; 09956 pOutputSamples[i*8+5] = (drflac_int16)temp2R; 09957 pOutputSamples[i*8+6] = (drflac_int16)temp3L; 09958 pOutputSamples[i*8+7] = (drflac_int16)temp3R; 09959 } 09960 } 09961 09962 for (i = (frameCount4 << 2); i < frameCount; ++i) { 09963 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 09964 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 09965 09966 mid = (mid << 1) | (side & 0x01); 09967 09968 pOutputSamples[i*2+0] = (drflac_int16)(((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample) >> 16); 09969 pOutputSamples[i*2+1] = (drflac_int16)(((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample) >> 16); 09970 } 09971 } 09972 09973 #if defined(DRFLAC_SUPPORT_SSE2) 09974 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 09975 { 09976 drflac_uint64 i; 09977 drflac_uint64 frameCount4 = frameCount >> 2; 09978 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 09979 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 09980 drflac_uint32 shift = unusedBitsPerSample; 09981 09982 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 09983 09984 if (shift == 0) { 09985 for (i = 0; i < frameCount4; ++i) { 09986 __m128i mid; 09987 __m128i side; 09988 __m128i left; 09989 __m128i right; 09990 09991 mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 09992 side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 09993 09994 mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01))); 09995 09996 left = _mm_srai_epi32(_mm_add_epi32(mid, side), 1); 09997 right = _mm_srai_epi32(_mm_sub_epi32(mid, side), 1); 09998 09999 left = _mm_srai_epi32(left, 16); 10000 right = _mm_srai_epi32(right, 16); 10001 10002 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right)); 10003 } 10004 10005 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10006 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10007 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10008 10009 mid = (mid << 1) | (side & 0x01); 10010 10011 pOutputSamples[i*2+0] = (drflac_int16)(((drflac_int32)(mid + side) >> 1) >> 16); 10012 pOutputSamples[i*2+1] = (drflac_int16)(((drflac_int32)(mid - side) >> 1) >> 16); 10013 } 10014 } else { 10015 shift -= 1; 10016 for (i = 0; i < frameCount4; ++i) { 10017 __m128i mid; 10018 __m128i side; 10019 __m128i left; 10020 __m128i right; 10021 10022 mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 10023 side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 10024 10025 mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01))); 10026 10027 left = _mm_slli_epi32(_mm_add_epi32(mid, side), shift); 10028 right = _mm_slli_epi32(_mm_sub_epi32(mid, side), shift); 10029 10030 left = _mm_srai_epi32(left, 16); 10031 right = _mm_srai_epi32(right, 16); 10032 10033 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right)); 10034 } 10035 10036 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10037 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10038 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10039 10040 mid = (mid << 1) | (side & 0x01); 10041 10042 pOutputSamples[i*2+0] = (drflac_int16)(((mid + side) << shift) >> 16); 10043 pOutputSamples[i*2+1] = (drflac_int16)(((mid - side) << shift) >> 16); 10044 } 10045 } 10046 } 10047 #endif 10048 10049 #if defined(DRFLAC_SUPPORT_NEON) 10050 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 10051 { 10052 drflac_uint64 i; 10053 drflac_uint64 frameCount4 = frameCount >> 2; 10054 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10055 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10056 drflac_uint32 shift = unusedBitsPerSample; 10057 int32x4_t wbpsShift0_4; /* wbps = Wasted Bits Per Sample */ 10058 int32x4_t wbpsShift1_4; /* wbps = Wasted Bits Per Sample */ 10059 10060 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 10061 10062 wbpsShift0_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 10063 wbpsShift1_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 10064 10065 if (shift == 0) { 10066 for (i = 0; i < frameCount4; ++i) { 10067 uint32x4_t mid; 10068 uint32x4_t side; 10069 int32x4_t left; 10070 int32x4_t right; 10071 10072 mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbpsShift0_4); 10073 side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbpsShift1_4); 10074 10075 mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, vdupq_n_u32(1))); 10076 10077 left = vshrq_n_s32(vreinterpretq_s32_u32(vaddq_u32(mid, side)), 1); 10078 right = vshrq_n_s32(vreinterpretq_s32_u32(vsubq_u32(mid, side)), 1); 10079 10080 left = vshrq_n_s32(left, 16); 10081 right = vshrq_n_s32(right, 16); 10082 10083 drflac__vst2q_s16(pOutputSamples + i*8, vzip_s16(vmovn_s32(left), vmovn_s32(right))); 10084 } 10085 10086 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10087 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10088 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10089 10090 mid = (mid << 1) | (side & 0x01); 10091 10092 pOutputSamples[i*2+0] = (drflac_int16)(((drflac_int32)(mid + side) >> 1) >> 16); 10093 pOutputSamples[i*2+1] = (drflac_int16)(((drflac_int32)(mid - side) >> 1) >> 16); 10094 } 10095 } else { 10096 int32x4_t shift4; 10097 10098 shift -= 1; 10099 shift4 = vdupq_n_s32(shift); 10100 10101 for (i = 0; i < frameCount4; ++i) { 10102 uint32x4_t mid; 10103 uint32x4_t side; 10104 int32x4_t left; 10105 int32x4_t right; 10106 10107 mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbpsShift0_4); 10108 side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbpsShift1_4); 10109 10110 mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, vdupq_n_u32(1))); 10111 10112 left = vreinterpretq_s32_u32(vshlq_u32(vaddq_u32(mid, side), shift4)); 10113 right = vreinterpretq_s32_u32(vshlq_u32(vsubq_u32(mid, side), shift4)); 10114 10115 left = vshrq_n_s32(left, 16); 10116 right = vshrq_n_s32(right, 16); 10117 10118 drflac__vst2q_s16(pOutputSamples + i*8, vzip_s16(vmovn_s32(left), vmovn_s32(right))); 10119 } 10120 10121 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10122 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10123 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10124 10125 mid = (mid << 1) | (side & 0x01); 10126 10127 pOutputSamples[i*2+0] = (drflac_int16)(((mid + side) << shift) >> 16); 10128 pOutputSamples[i*2+1] = (drflac_int16)(((mid - side) << shift) >> 16); 10129 } 10130 } 10131 } 10132 #endif 10133 10134 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_mid_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 10135 { 10136 #if defined(DRFLAC_SUPPORT_SSE2) 10137 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 10138 drflac_read_pcm_frames_s16__decode_mid_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10139 } else 10140 #elif defined(DRFLAC_SUPPORT_NEON) 10141 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 10142 drflac_read_pcm_frames_s16__decode_mid_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10143 } else 10144 #endif 10145 { 10146 /* Scalar fallback. */ 10147 #if 0 10148 drflac_read_pcm_frames_s16__decode_mid_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10149 #else 10150 drflac_read_pcm_frames_s16__decode_mid_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10151 #endif 10152 } 10153 } 10154 10155 10156 #if 0 10157 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_independent_stereo__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 10158 { 10159 for (drflac_uint64 i = 0; i < frameCount; ++i) { 10160 pOutputSamples[i*2+0] = (drflac_int16)((drflac_int32)((drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample)) >> 16); 10161 pOutputSamples[i*2+1] = (drflac_int16)((drflac_int32)((drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample)) >> 16); 10162 } 10163 } 10164 #endif 10165 10166 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_independent_stereo__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 10167 { 10168 drflac_uint64 i; 10169 drflac_uint64 frameCount4 = frameCount >> 2; 10170 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10171 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10172 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10173 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10174 10175 for (i = 0; i < frameCount4; ++i) { 10176 drflac_uint32 tempL0 = pInputSamples0U32[i*4+0] << shift0; 10177 drflac_uint32 tempL1 = pInputSamples0U32[i*4+1] << shift0; 10178 drflac_uint32 tempL2 = pInputSamples0U32[i*4+2] << shift0; 10179 drflac_uint32 tempL3 = pInputSamples0U32[i*4+3] << shift0; 10180 10181 drflac_uint32 tempR0 = pInputSamples1U32[i*4+0] << shift1; 10182 drflac_uint32 tempR1 = pInputSamples1U32[i*4+1] << shift1; 10183 drflac_uint32 tempR2 = pInputSamples1U32[i*4+2] << shift1; 10184 drflac_uint32 tempR3 = pInputSamples1U32[i*4+3] << shift1; 10185 10186 tempL0 >>= 16; 10187 tempL1 >>= 16; 10188 tempL2 >>= 16; 10189 tempL3 >>= 16; 10190 10191 tempR0 >>= 16; 10192 tempR1 >>= 16; 10193 tempR2 >>= 16; 10194 tempR3 >>= 16; 10195 10196 pOutputSamples[i*8+0] = (drflac_int16)tempL0; 10197 pOutputSamples[i*8+1] = (drflac_int16)tempR0; 10198 pOutputSamples[i*8+2] = (drflac_int16)tempL1; 10199 pOutputSamples[i*8+3] = (drflac_int16)tempR1; 10200 pOutputSamples[i*8+4] = (drflac_int16)tempL2; 10201 pOutputSamples[i*8+5] = (drflac_int16)tempR2; 10202 pOutputSamples[i*8+6] = (drflac_int16)tempL3; 10203 pOutputSamples[i*8+7] = (drflac_int16)tempR3; 10204 } 10205 10206 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10207 pOutputSamples[i*2+0] = (drflac_int16)((pInputSamples0U32[i] << shift0) >> 16); 10208 pOutputSamples[i*2+1] = (drflac_int16)((pInputSamples1U32[i] << shift1) >> 16); 10209 } 10210 } 10211 10212 #if defined(DRFLAC_SUPPORT_SSE2) 10213 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_independent_stereo__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 10214 { 10215 drflac_uint64 i; 10216 drflac_uint64 frameCount4 = frameCount >> 2; 10217 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10218 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10219 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10220 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10221 10222 for (i = 0; i < frameCount4; ++i) { 10223 __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0); 10224 __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1); 10225 10226 left = _mm_srai_epi32(left, 16); 10227 right = _mm_srai_epi32(right, 16); 10228 10229 /* At this point we have results. We can now pack and interleave these into a single __m128i object and then store the in the output buffer. */ 10230 _mm_storeu_si128((__m128i*)(pOutputSamples + i*8), drflac__mm_packs_interleaved_epi32(left, right)); 10231 } 10232 10233 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10234 pOutputSamples[i*2+0] = (drflac_int16)((pInputSamples0U32[i] << shift0) >> 16); 10235 pOutputSamples[i*2+1] = (drflac_int16)((pInputSamples1U32[i] << shift1) >> 16); 10236 } 10237 } 10238 #endif 10239 10240 #if defined(DRFLAC_SUPPORT_NEON) 10241 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_independent_stereo__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 10242 { 10243 drflac_uint64 i; 10244 drflac_uint64 frameCount4 = frameCount >> 2; 10245 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10246 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10247 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10248 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10249 10250 int32x4_t shift0_4 = vdupq_n_s32(shift0); 10251 int32x4_t shift1_4 = vdupq_n_s32(shift1); 10252 10253 for (i = 0; i < frameCount4; ++i) { 10254 int32x4_t left; 10255 int32x4_t right; 10256 10257 left = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4)); 10258 right = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4)); 10259 10260 left = vshrq_n_s32(left, 16); 10261 right = vshrq_n_s32(right, 16); 10262 10263 drflac__vst2q_s16(pOutputSamples + i*8, vzip_s16(vmovn_s32(left), vmovn_s32(right))); 10264 } 10265 10266 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10267 pOutputSamples[i*2+0] = (drflac_int16)((pInputSamples0U32[i] << shift0) >> 16); 10268 pOutputSamples[i*2+1] = (drflac_int16)((pInputSamples1U32[i] << shift1) >> 16); 10269 } 10270 } 10271 #endif 10272 10273 static DRFLAC_INLINE void drflac_read_pcm_frames_s16__decode_independent_stereo(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, drflac_int16* pOutputSamples) 10274 { 10275 #if defined(DRFLAC_SUPPORT_SSE2) 10276 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 10277 drflac_read_pcm_frames_s16__decode_independent_stereo__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10278 } else 10279 #elif defined(DRFLAC_SUPPORT_NEON) 10280 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 10281 drflac_read_pcm_frames_s16__decode_independent_stereo__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10282 } else 10283 #endif 10284 { 10285 /* Scalar fallback. */ 10286 #if 0 10287 drflac_read_pcm_frames_s16__decode_independent_stereo__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10288 #else 10289 drflac_read_pcm_frames_s16__decode_independent_stereo__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10290 #endif 10291 } 10292 } 10293 10294 DRFLAC_API drflac_uint64 drflac_read_pcm_frames_s16(drflac* pFlac, drflac_uint64 framesToRead, drflac_int16* pBufferOut) 10295 { 10296 drflac_uint64 framesRead; 10297 drflac_uint32 unusedBitsPerSample; 10298 10299 if (pFlac == NULL || framesToRead == 0) { 10300 return 0; 10301 } 10302 10303 if (pBufferOut == NULL) { 10304 return drflac__seek_forward_by_pcm_frames(pFlac, framesToRead); 10305 } 10306 10307 DRFLAC_ASSERT(pFlac->bitsPerSample <= 32); 10308 unusedBitsPerSample = 32 - pFlac->bitsPerSample; 10309 10310 framesRead = 0; 10311 while (framesToRead > 0) { 10312 /* If we've run out of samples in this frame, go to the next. */ 10313 if (pFlac->currentFLACFrame.pcmFramesRemaining == 0) { 10314 if (!drflac__read_and_decode_next_flac_frame(pFlac)) { 10315 break; /* Couldn't read the next frame, so just break from the loop and return. */ 10316 } 10317 } else { 10318 unsigned int channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment); 10319 drflac_uint64 iFirstPCMFrame = pFlac->currentFLACFrame.header.blockSizeInPCMFrames - pFlac->currentFLACFrame.pcmFramesRemaining; 10320 drflac_uint64 frameCountThisIteration = framesToRead; 10321 10322 if (frameCountThisIteration > pFlac->currentFLACFrame.pcmFramesRemaining) { 10323 frameCountThisIteration = pFlac->currentFLACFrame.pcmFramesRemaining; 10324 } 10325 10326 if (channelCount == 2) { 10327 const drflac_int32* pDecodedSamples0 = pFlac->currentFLACFrame.subframes[0].pSamplesS32 + iFirstPCMFrame; 10328 const drflac_int32* pDecodedSamples1 = pFlac->currentFLACFrame.subframes[1].pSamplesS32 + iFirstPCMFrame; 10329 10330 switch (pFlac->currentFLACFrame.header.channelAssignment) 10331 { 10332 case DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE: 10333 { 10334 drflac_read_pcm_frames_s16__decode_left_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 10335 } break; 10336 10337 case DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE: 10338 { 10339 drflac_read_pcm_frames_s16__decode_right_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 10340 } break; 10341 10342 case DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE: 10343 { 10344 drflac_read_pcm_frames_s16__decode_mid_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 10345 } break; 10346 10347 case DRFLAC_CHANNEL_ASSIGNMENT_INDEPENDENT: 10348 default: 10349 { 10350 drflac_read_pcm_frames_s16__decode_independent_stereo(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 10351 } break; 10352 } 10353 } else { 10354 /* Generic interleaving. */ 10355 drflac_uint64 i; 10356 for (i = 0; i < frameCountThisIteration; ++i) { 10357 unsigned int j; 10358 for (j = 0; j < channelCount; ++j) { 10359 drflac_int32 sampleS32 = (drflac_int32)((drflac_uint32)(pFlac->currentFLACFrame.subframes[j].pSamplesS32[iFirstPCMFrame + i]) << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[j].wastedBitsPerSample)); 10360 pBufferOut[(i*channelCount)+j] = (drflac_int16)(sampleS32 >> 16); 10361 } 10362 } 10363 } 10364 10365 framesRead += frameCountThisIteration; 10366 pBufferOut += frameCountThisIteration * channelCount; 10367 framesToRead -= frameCountThisIteration; 10368 pFlac->currentPCMFrame += frameCountThisIteration; 10369 pFlac->currentFLACFrame.pcmFramesRemaining -= (drflac_uint32)frameCountThisIteration; 10370 } 10371 } 10372 10373 return framesRead; 10374 } 10375 10376 10377 #if 0 10378 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10379 { 10380 drflac_uint64 i; 10381 for (i = 0; i < frameCount; ++i) { 10382 drflac_uint32 left = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 10383 drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 10384 drflac_uint32 right = left - side; 10385 10386 pOutputSamples[i*2+0] = (float)((drflac_int32)left / 2147483648.0); 10387 pOutputSamples[i*2+1] = (float)((drflac_int32)right / 2147483648.0); 10388 } 10389 } 10390 #endif 10391 10392 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10393 { 10394 drflac_uint64 i; 10395 drflac_uint64 frameCount4 = frameCount >> 2; 10396 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10397 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10398 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10399 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10400 10401 float factor = 1 / 2147483648.0; 10402 10403 for (i = 0; i < frameCount4; ++i) { 10404 drflac_uint32 left0 = pInputSamples0U32[i*4+0] << shift0; 10405 drflac_uint32 left1 = pInputSamples0U32[i*4+1] << shift0; 10406 drflac_uint32 left2 = pInputSamples0U32[i*4+2] << shift0; 10407 drflac_uint32 left3 = pInputSamples0U32[i*4+3] << shift0; 10408 10409 drflac_uint32 side0 = pInputSamples1U32[i*4+0] << shift1; 10410 drflac_uint32 side1 = pInputSamples1U32[i*4+1] << shift1; 10411 drflac_uint32 side2 = pInputSamples1U32[i*4+2] << shift1; 10412 drflac_uint32 side3 = pInputSamples1U32[i*4+3] << shift1; 10413 10414 drflac_uint32 right0 = left0 - side0; 10415 drflac_uint32 right1 = left1 - side1; 10416 drflac_uint32 right2 = left2 - side2; 10417 drflac_uint32 right3 = left3 - side3; 10418 10419 pOutputSamples[i*8+0] = (drflac_int32)left0 * factor; 10420 pOutputSamples[i*8+1] = (drflac_int32)right0 * factor; 10421 pOutputSamples[i*8+2] = (drflac_int32)left1 * factor; 10422 pOutputSamples[i*8+3] = (drflac_int32)right1 * factor; 10423 pOutputSamples[i*8+4] = (drflac_int32)left2 * factor; 10424 pOutputSamples[i*8+5] = (drflac_int32)right2 * factor; 10425 pOutputSamples[i*8+6] = (drflac_int32)left3 * factor; 10426 pOutputSamples[i*8+7] = (drflac_int32)right3 * factor; 10427 } 10428 10429 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10430 drflac_uint32 left = pInputSamples0U32[i] << shift0; 10431 drflac_uint32 side = pInputSamples1U32[i] << shift1; 10432 drflac_uint32 right = left - side; 10433 10434 pOutputSamples[i*2+0] = (drflac_int32)left * factor; 10435 pOutputSamples[i*2+1] = (drflac_int32)right * factor; 10436 } 10437 } 10438 10439 #if defined(DRFLAC_SUPPORT_SSE2) 10440 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10441 { 10442 drflac_uint64 i; 10443 drflac_uint64 frameCount4 = frameCount >> 2; 10444 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10445 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10446 drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8; 10447 drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8; 10448 __m128 factor; 10449 10450 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 10451 10452 factor = _mm_set1_ps(1.0f / 8388608.0f); 10453 10454 for (i = 0; i < frameCount4; ++i) { 10455 __m128i left = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0); 10456 __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1); 10457 __m128i right = _mm_sub_epi32(left, side); 10458 __m128 leftf = _mm_mul_ps(_mm_cvtepi32_ps(left), factor); 10459 __m128 rightf = _mm_mul_ps(_mm_cvtepi32_ps(right), factor); 10460 10461 _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf)); 10462 _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf)); 10463 } 10464 10465 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10466 drflac_uint32 left = pInputSamples0U32[i] << shift0; 10467 drflac_uint32 side = pInputSamples1U32[i] << shift1; 10468 drflac_uint32 right = left - side; 10469 10470 pOutputSamples[i*2+0] = (drflac_int32)left / 8388608.0f; 10471 pOutputSamples[i*2+1] = (drflac_int32)right / 8388608.0f; 10472 } 10473 } 10474 #endif 10475 10476 #if defined(DRFLAC_SUPPORT_NEON) 10477 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10478 { 10479 drflac_uint64 i; 10480 drflac_uint64 frameCount4 = frameCount >> 2; 10481 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10482 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10483 drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8; 10484 drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8; 10485 float32x4_t factor4; 10486 int32x4_t shift0_4; 10487 int32x4_t shift1_4; 10488 10489 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 10490 10491 factor4 = vdupq_n_f32(1.0f / 8388608.0f); 10492 shift0_4 = vdupq_n_s32(shift0); 10493 shift1_4 = vdupq_n_s32(shift1); 10494 10495 for (i = 0; i < frameCount4; ++i) { 10496 uint32x4_t left; 10497 uint32x4_t side; 10498 uint32x4_t right; 10499 float32x4_t leftf; 10500 float32x4_t rightf; 10501 10502 left = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4); 10503 side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4); 10504 right = vsubq_u32(left, side); 10505 leftf = vmulq_f32(vcvtq_f32_s32(vreinterpretq_s32_u32(left)), factor4); 10506 rightf = vmulq_f32(vcvtq_f32_s32(vreinterpretq_s32_u32(right)), factor4); 10507 10508 drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf)); 10509 } 10510 10511 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10512 drflac_uint32 left = pInputSamples0U32[i] << shift0; 10513 drflac_uint32 side = pInputSamples1U32[i] << shift1; 10514 drflac_uint32 right = left - side; 10515 10516 pOutputSamples[i*2+0] = (drflac_int32)left / 8388608.0f; 10517 pOutputSamples[i*2+1] = (drflac_int32)right / 8388608.0f; 10518 } 10519 } 10520 #endif 10521 10522 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_left_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10523 { 10524 #if defined(DRFLAC_SUPPORT_SSE2) 10525 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 10526 drflac_read_pcm_frames_f32__decode_left_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10527 } else 10528 #elif defined(DRFLAC_SUPPORT_NEON) 10529 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 10530 drflac_read_pcm_frames_f32__decode_left_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10531 } else 10532 #endif 10533 { 10534 /* Scalar fallback. */ 10535 #if 0 10536 drflac_read_pcm_frames_f32__decode_left_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10537 #else 10538 drflac_read_pcm_frames_f32__decode_left_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10539 #endif 10540 } 10541 } 10542 10543 10544 #if 0 10545 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_right_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10546 { 10547 drflac_uint64 i; 10548 for (i = 0; i < frameCount; ++i) { 10549 drflac_uint32 side = (drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 10550 drflac_uint32 right = (drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 10551 drflac_uint32 left = right + side; 10552 10553 pOutputSamples[i*2+0] = (float)((drflac_int32)left / 2147483648.0); 10554 pOutputSamples[i*2+1] = (float)((drflac_int32)right / 2147483648.0); 10555 } 10556 } 10557 #endif 10558 10559 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_right_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10560 { 10561 drflac_uint64 i; 10562 drflac_uint64 frameCount4 = frameCount >> 2; 10563 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10564 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10565 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10566 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10567 float factor = 1 / 2147483648.0; 10568 10569 for (i = 0; i < frameCount4; ++i) { 10570 drflac_uint32 side0 = pInputSamples0U32[i*4+0] << shift0; 10571 drflac_uint32 side1 = pInputSamples0U32[i*4+1] << shift0; 10572 drflac_uint32 side2 = pInputSamples0U32[i*4+2] << shift0; 10573 drflac_uint32 side3 = pInputSamples0U32[i*4+3] << shift0; 10574 10575 drflac_uint32 right0 = pInputSamples1U32[i*4+0] << shift1; 10576 drflac_uint32 right1 = pInputSamples1U32[i*4+1] << shift1; 10577 drflac_uint32 right2 = pInputSamples1U32[i*4+2] << shift1; 10578 drflac_uint32 right3 = pInputSamples1U32[i*4+3] << shift1; 10579 10580 drflac_uint32 left0 = right0 + side0; 10581 drflac_uint32 left1 = right1 + side1; 10582 drflac_uint32 left2 = right2 + side2; 10583 drflac_uint32 left3 = right3 + side3; 10584 10585 pOutputSamples[i*8+0] = (drflac_int32)left0 * factor; 10586 pOutputSamples[i*8+1] = (drflac_int32)right0 * factor; 10587 pOutputSamples[i*8+2] = (drflac_int32)left1 * factor; 10588 pOutputSamples[i*8+3] = (drflac_int32)right1 * factor; 10589 pOutputSamples[i*8+4] = (drflac_int32)left2 * factor; 10590 pOutputSamples[i*8+5] = (drflac_int32)right2 * factor; 10591 pOutputSamples[i*8+6] = (drflac_int32)left3 * factor; 10592 pOutputSamples[i*8+7] = (drflac_int32)right3 * factor; 10593 } 10594 10595 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10596 drflac_uint32 side = pInputSamples0U32[i] << shift0; 10597 drflac_uint32 right = pInputSamples1U32[i] << shift1; 10598 drflac_uint32 left = right + side; 10599 10600 pOutputSamples[i*2+0] = (drflac_int32)left * factor; 10601 pOutputSamples[i*2+1] = (drflac_int32)right * factor; 10602 } 10603 } 10604 10605 #if defined(DRFLAC_SUPPORT_SSE2) 10606 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_right_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10607 { 10608 drflac_uint64 i; 10609 drflac_uint64 frameCount4 = frameCount >> 2; 10610 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10611 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10612 drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8; 10613 drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8; 10614 __m128 factor; 10615 10616 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 10617 10618 factor = _mm_set1_ps(1.0f / 8388608.0f); 10619 10620 for (i = 0; i < frameCount4; ++i) { 10621 __m128i side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0); 10622 __m128i right = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1); 10623 __m128i left = _mm_add_epi32(right, side); 10624 __m128 leftf = _mm_mul_ps(_mm_cvtepi32_ps(left), factor); 10625 __m128 rightf = _mm_mul_ps(_mm_cvtepi32_ps(right), factor); 10626 10627 _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf)); 10628 _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf)); 10629 } 10630 10631 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10632 drflac_uint32 side = pInputSamples0U32[i] << shift0; 10633 drflac_uint32 right = pInputSamples1U32[i] << shift1; 10634 drflac_uint32 left = right + side; 10635 10636 pOutputSamples[i*2+0] = (drflac_int32)left / 8388608.0f; 10637 pOutputSamples[i*2+1] = (drflac_int32)right / 8388608.0f; 10638 } 10639 } 10640 #endif 10641 10642 #if defined(DRFLAC_SUPPORT_NEON) 10643 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_right_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10644 { 10645 drflac_uint64 i; 10646 drflac_uint64 frameCount4 = frameCount >> 2; 10647 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10648 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10649 drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8; 10650 drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8; 10651 float32x4_t factor4; 10652 int32x4_t shift0_4; 10653 int32x4_t shift1_4; 10654 10655 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 10656 10657 factor4 = vdupq_n_f32(1.0f / 8388608.0f); 10658 shift0_4 = vdupq_n_s32(shift0); 10659 shift1_4 = vdupq_n_s32(shift1); 10660 10661 for (i = 0; i < frameCount4; ++i) { 10662 uint32x4_t side; 10663 uint32x4_t right; 10664 uint32x4_t left; 10665 float32x4_t leftf; 10666 float32x4_t rightf; 10667 10668 side = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4); 10669 right = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4); 10670 left = vaddq_u32(right, side); 10671 leftf = vmulq_f32(vcvtq_f32_s32(vreinterpretq_s32_u32(left)), factor4); 10672 rightf = vmulq_f32(vcvtq_f32_s32(vreinterpretq_s32_u32(right)), factor4); 10673 10674 drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf)); 10675 } 10676 10677 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10678 drflac_uint32 side = pInputSamples0U32[i] << shift0; 10679 drflac_uint32 right = pInputSamples1U32[i] << shift1; 10680 drflac_uint32 left = right + side; 10681 10682 pOutputSamples[i*2+0] = (drflac_int32)left / 8388608.0f; 10683 pOutputSamples[i*2+1] = (drflac_int32)right / 8388608.0f; 10684 } 10685 } 10686 #endif 10687 10688 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_right_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10689 { 10690 #if defined(DRFLAC_SUPPORT_SSE2) 10691 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 10692 drflac_read_pcm_frames_f32__decode_right_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10693 } else 10694 #elif defined(DRFLAC_SUPPORT_NEON) 10695 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 10696 drflac_read_pcm_frames_f32__decode_right_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10697 } else 10698 #endif 10699 { 10700 /* Scalar fallback. */ 10701 #if 0 10702 drflac_read_pcm_frames_f32__decode_right_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10703 #else 10704 drflac_read_pcm_frames_f32__decode_right_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 10705 #endif 10706 } 10707 } 10708 10709 10710 #if 0 10711 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_mid_side__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10712 { 10713 for (drflac_uint64 i = 0; i < frameCount; ++i) { 10714 drflac_uint32 mid = (drflac_uint32)pInputSamples0[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10715 drflac_uint32 side = (drflac_uint32)pInputSamples1[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10716 10717 mid = (mid << 1) | (side & 0x01); 10718 10719 pOutputSamples[i*2+0] = (float)((((drflac_int32)(mid + side) >> 1) << (unusedBitsPerSample)) / 2147483648.0); 10720 pOutputSamples[i*2+1] = (float)((((drflac_int32)(mid - side) >> 1) << (unusedBitsPerSample)) / 2147483648.0); 10721 } 10722 } 10723 #endif 10724 10725 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_mid_side__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10726 { 10727 drflac_uint64 i; 10728 drflac_uint64 frameCount4 = frameCount >> 2; 10729 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10730 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10731 drflac_uint32 shift = unusedBitsPerSample; 10732 float factor = 1 / 2147483648.0; 10733 10734 if (shift > 0) { 10735 shift -= 1; 10736 for (i = 0; i < frameCount4; ++i) { 10737 drflac_uint32 temp0L; 10738 drflac_uint32 temp1L; 10739 drflac_uint32 temp2L; 10740 drflac_uint32 temp3L; 10741 drflac_uint32 temp0R; 10742 drflac_uint32 temp1R; 10743 drflac_uint32 temp2R; 10744 drflac_uint32 temp3R; 10745 10746 drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10747 drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10748 drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10749 drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10750 10751 drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10752 drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10753 drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10754 drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10755 10756 mid0 = (mid0 << 1) | (side0 & 0x01); 10757 mid1 = (mid1 << 1) | (side1 & 0x01); 10758 mid2 = (mid2 << 1) | (side2 & 0x01); 10759 mid3 = (mid3 << 1) | (side3 & 0x01); 10760 10761 temp0L = (mid0 + side0) << shift; 10762 temp1L = (mid1 + side1) << shift; 10763 temp2L = (mid2 + side2) << shift; 10764 temp3L = (mid3 + side3) << shift; 10765 10766 temp0R = (mid0 - side0) << shift; 10767 temp1R = (mid1 - side1) << shift; 10768 temp2R = (mid2 - side2) << shift; 10769 temp3R = (mid3 - side3) << shift; 10770 10771 pOutputSamples[i*8+0] = (drflac_int32)temp0L * factor; 10772 pOutputSamples[i*8+1] = (drflac_int32)temp0R * factor; 10773 pOutputSamples[i*8+2] = (drflac_int32)temp1L * factor; 10774 pOutputSamples[i*8+3] = (drflac_int32)temp1R * factor; 10775 pOutputSamples[i*8+4] = (drflac_int32)temp2L * factor; 10776 pOutputSamples[i*8+5] = (drflac_int32)temp2R * factor; 10777 pOutputSamples[i*8+6] = (drflac_int32)temp3L * factor; 10778 pOutputSamples[i*8+7] = (drflac_int32)temp3R * factor; 10779 } 10780 } else { 10781 for (i = 0; i < frameCount4; ++i) { 10782 drflac_uint32 temp0L; 10783 drflac_uint32 temp1L; 10784 drflac_uint32 temp2L; 10785 drflac_uint32 temp3L; 10786 drflac_uint32 temp0R; 10787 drflac_uint32 temp1R; 10788 drflac_uint32 temp2R; 10789 drflac_uint32 temp3R; 10790 10791 drflac_uint32 mid0 = pInputSamples0U32[i*4+0] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10792 drflac_uint32 mid1 = pInputSamples0U32[i*4+1] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10793 drflac_uint32 mid2 = pInputSamples0U32[i*4+2] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10794 drflac_uint32 mid3 = pInputSamples0U32[i*4+3] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10795 10796 drflac_uint32 side0 = pInputSamples1U32[i*4+0] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10797 drflac_uint32 side1 = pInputSamples1U32[i*4+1] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10798 drflac_uint32 side2 = pInputSamples1U32[i*4+2] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10799 drflac_uint32 side3 = pInputSamples1U32[i*4+3] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10800 10801 mid0 = (mid0 << 1) | (side0 & 0x01); 10802 mid1 = (mid1 << 1) | (side1 & 0x01); 10803 mid2 = (mid2 << 1) | (side2 & 0x01); 10804 mid3 = (mid3 << 1) | (side3 & 0x01); 10805 10806 temp0L = (drflac_uint32)((drflac_int32)(mid0 + side0) >> 1); 10807 temp1L = (drflac_uint32)((drflac_int32)(mid1 + side1) >> 1); 10808 temp2L = (drflac_uint32)((drflac_int32)(mid2 + side2) >> 1); 10809 temp3L = (drflac_uint32)((drflac_int32)(mid3 + side3) >> 1); 10810 10811 temp0R = (drflac_uint32)((drflac_int32)(mid0 - side0) >> 1); 10812 temp1R = (drflac_uint32)((drflac_int32)(mid1 - side1) >> 1); 10813 temp2R = (drflac_uint32)((drflac_int32)(mid2 - side2) >> 1); 10814 temp3R = (drflac_uint32)((drflac_int32)(mid3 - side3) >> 1); 10815 10816 pOutputSamples[i*8+0] = (drflac_int32)temp0L * factor; 10817 pOutputSamples[i*8+1] = (drflac_int32)temp0R * factor; 10818 pOutputSamples[i*8+2] = (drflac_int32)temp1L * factor; 10819 pOutputSamples[i*8+3] = (drflac_int32)temp1R * factor; 10820 pOutputSamples[i*8+4] = (drflac_int32)temp2L * factor; 10821 pOutputSamples[i*8+5] = (drflac_int32)temp2R * factor; 10822 pOutputSamples[i*8+6] = (drflac_int32)temp3L * factor; 10823 pOutputSamples[i*8+7] = (drflac_int32)temp3R * factor; 10824 } 10825 } 10826 10827 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10828 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10829 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10830 10831 mid = (mid << 1) | (side & 0x01); 10832 10833 pOutputSamples[i*2+0] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid + side) >> 1) << unusedBitsPerSample) * factor; 10834 pOutputSamples[i*2+1] = (drflac_int32)((drflac_uint32)((drflac_int32)(mid - side) >> 1) << unusedBitsPerSample) * factor; 10835 } 10836 } 10837 10838 #if defined(DRFLAC_SUPPORT_SSE2) 10839 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_mid_side__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10840 { 10841 drflac_uint64 i; 10842 drflac_uint64 frameCount4 = frameCount >> 2; 10843 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10844 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10845 drflac_uint32 shift = unusedBitsPerSample - 8; 10846 float factor; 10847 __m128 factor128; 10848 10849 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 10850 10851 factor = 1.0f / 8388608.0f; 10852 factor128 = _mm_set1_ps(factor); 10853 10854 if (shift == 0) { 10855 for (i = 0; i < frameCount4; ++i) { 10856 __m128i mid; 10857 __m128i side; 10858 __m128i tempL; 10859 __m128i tempR; 10860 __m128 leftf; 10861 __m128 rightf; 10862 10863 mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 10864 side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 10865 10866 mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01))); 10867 10868 tempL = _mm_srai_epi32(_mm_add_epi32(mid, side), 1); 10869 tempR = _mm_srai_epi32(_mm_sub_epi32(mid, side), 1); 10870 10871 leftf = _mm_mul_ps(_mm_cvtepi32_ps(tempL), factor128); 10872 rightf = _mm_mul_ps(_mm_cvtepi32_ps(tempR), factor128); 10873 10874 _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf)); 10875 _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf)); 10876 } 10877 10878 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10879 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10880 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10881 10882 mid = (mid << 1) | (side & 0x01); 10883 10884 pOutputSamples[i*2+0] = ((drflac_int32)(mid + side) >> 1) * factor; 10885 pOutputSamples[i*2+1] = ((drflac_int32)(mid - side) >> 1) * factor; 10886 } 10887 } else { 10888 shift -= 1; 10889 for (i = 0; i < frameCount4; ++i) { 10890 __m128i mid; 10891 __m128i side; 10892 __m128i tempL; 10893 __m128i tempR; 10894 __m128 leftf; 10895 __m128 rightf; 10896 10897 mid = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 10898 side = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 10899 10900 mid = _mm_or_si128(_mm_slli_epi32(mid, 1), _mm_and_si128(side, _mm_set1_epi32(0x01))); 10901 10902 tempL = _mm_slli_epi32(_mm_add_epi32(mid, side), shift); 10903 tempR = _mm_slli_epi32(_mm_sub_epi32(mid, side), shift); 10904 10905 leftf = _mm_mul_ps(_mm_cvtepi32_ps(tempL), factor128); 10906 rightf = _mm_mul_ps(_mm_cvtepi32_ps(tempR), factor128); 10907 10908 _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf)); 10909 _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf)); 10910 } 10911 10912 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10913 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10914 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10915 10916 mid = (mid << 1) | (side & 0x01); 10917 10918 pOutputSamples[i*2+0] = (drflac_int32)((mid + side) << shift) * factor; 10919 pOutputSamples[i*2+1] = (drflac_int32)((mid - side) << shift) * factor; 10920 } 10921 } 10922 } 10923 #endif 10924 10925 #if defined(DRFLAC_SUPPORT_NEON) 10926 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_mid_side__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 10927 { 10928 drflac_uint64 i; 10929 drflac_uint64 frameCount4 = frameCount >> 2; 10930 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 10931 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 10932 drflac_uint32 shift = unusedBitsPerSample - 8; 10933 float factor; 10934 float32x4_t factor4; 10935 int32x4_t shift4; 10936 int32x4_t wbps0_4; /* Wasted Bits Per Sample */ 10937 int32x4_t wbps1_4; /* Wasted Bits Per Sample */ 10938 10939 DRFLAC_ASSERT(pFlac->bitsPerSample <= 24); 10940 10941 factor = 1.0f / 8388608.0f; 10942 factor4 = vdupq_n_f32(factor); 10943 wbps0_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample); 10944 wbps1_4 = vdupq_n_s32(pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample); 10945 10946 if (shift == 0) { 10947 for (i = 0; i < frameCount4; ++i) { 10948 int32x4_t lefti; 10949 int32x4_t righti; 10950 float32x4_t leftf; 10951 float32x4_t rightf; 10952 10953 uint32x4_t mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbps0_4); 10954 uint32x4_t side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbps1_4); 10955 10956 mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, vdupq_n_u32(1))); 10957 10958 lefti = vshrq_n_s32(vreinterpretq_s32_u32(vaddq_u32(mid, side)), 1); 10959 righti = vshrq_n_s32(vreinterpretq_s32_u32(vsubq_u32(mid, side)), 1); 10960 10961 leftf = vmulq_f32(vcvtq_f32_s32(lefti), factor4); 10962 rightf = vmulq_f32(vcvtq_f32_s32(righti), factor4); 10963 10964 drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf)); 10965 } 10966 10967 for (i = (frameCount4 << 2); i < frameCount; ++i) { 10968 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 10969 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 10970 10971 mid = (mid << 1) | (side & 0x01); 10972 10973 pOutputSamples[i*2+0] = ((drflac_int32)(mid + side) >> 1) * factor; 10974 pOutputSamples[i*2+1] = ((drflac_int32)(mid - side) >> 1) * factor; 10975 } 10976 } else { 10977 shift -= 1; 10978 shift4 = vdupq_n_s32(shift); 10979 for (i = 0; i < frameCount4; ++i) { 10980 uint32x4_t mid; 10981 uint32x4_t side; 10982 int32x4_t lefti; 10983 int32x4_t righti; 10984 float32x4_t leftf; 10985 float32x4_t rightf; 10986 10987 mid = vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), wbps0_4); 10988 side = vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), wbps1_4); 10989 10990 mid = vorrq_u32(vshlq_n_u32(mid, 1), vandq_u32(side, vdupq_n_u32(1))); 10991 10992 lefti = vreinterpretq_s32_u32(vshlq_u32(vaddq_u32(mid, side), shift4)); 10993 righti = vreinterpretq_s32_u32(vshlq_u32(vsubq_u32(mid, side), shift4)); 10994 10995 leftf = vmulq_f32(vcvtq_f32_s32(lefti), factor4); 10996 rightf = vmulq_f32(vcvtq_f32_s32(righti), factor4); 10997 10998 drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf)); 10999 } 11000 11001 for (i = (frameCount4 << 2); i < frameCount; ++i) { 11002 drflac_uint32 mid = pInputSamples0U32[i] << pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 11003 drflac_uint32 side = pInputSamples1U32[i] << pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 11004 11005 mid = (mid << 1) | (side & 0x01); 11006 11007 pOutputSamples[i*2+0] = (drflac_int32)((mid + side) << shift) * factor; 11008 pOutputSamples[i*2+1] = (drflac_int32)((mid - side) << shift) * factor; 11009 } 11010 } 11011 } 11012 #endif 11013 11014 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_mid_side(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 11015 { 11016 #if defined(DRFLAC_SUPPORT_SSE2) 11017 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 11018 drflac_read_pcm_frames_f32__decode_mid_side__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 11019 } else 11020 #elif defined(DRFLAC_SUPPORT_NEON) 11021 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 11022 drflac_read_pcm_frames_f32__decode_mid_side__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 11023 } else 11024 #endif 11025 { 11026 /* Scalar fallback. */ 11027 #if 0 11028 drflac_read_pcm_frames_f32__decode_mid_side__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 11029 #else 11030 drflac_read_pcm_frames_f32__decode_mid_side__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 11031 #endif 11032 } 11033 } 11034 11035 #if 0 11036 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo__reference(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 11037 { 11038 for (drflac_uint64 i = 0; i < frameCount; ++i) { 11039 pOutputSamples[i*2+0] = (float)((drflac_int32)((drflac_uint32)pInputSamples0[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample)) / 2147483648.0); 11040 pOutputSamples[i*2+1] = (float)((drflac_int32)((drflac_uint32)pInputSamples1[i] << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample)) / 2147483648.0); 11041 } 11042 } 11043 #endif 11044 11045 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo__scalar(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 11046 { 11047 drflac_uint64 i; 11048 drflac_uint64 frameCount4 = frameCount >> 2; 11049 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 11050 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 11051 drflac_uint32 shift0 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample; 11052 drflac_uint32 shift1 = unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample; 11053 float factor = 1 / 2147483648.0; 11054 11055 for (i = 0; i < frameCount4; ++i) { 11056 drflac_uint32 tempL0 = pInputSamples0U32[i*4+0] << shift0; 11057 drflac_uint32 tempL1 = pInputSamples0U32[i*4+1] << shift0; 11058 drflac_uint32 tempL2 = pInputSamples0U32[i*4+2] << shift0; 11059 drflac_uint32 tempL3 = pInputSamples0U32[i*4+3] << shift0; 11060 11061 drflac_uint32 tempR0 = pInputSamples1U32[i*4+0] << shift1; 11062 drflac_uint32 tempR1 = pInputSamples1U32[i*4+1] << shift1; 11063 drflac_uint32 tempR2 = pInputSamples1U32[i*4+2] << shift1; 11064 drflac_uint32 tempR3 = pInputSamples1U32[i*4+3] << shift1; 11065 11066 pOutputSamples[i*8+0] = (drflac_int32)tempL0 * factor; 11067 pOutputSamples[i*8+1] = (drflac_int32)tempR0 * factor; 11068 pOutputSamples[i*8+2] = (drflac_int32)tempL1 * factor; 11069 pOutputSamples[i*8+3] = (drflac_int32)tempR1 * factor; 11070 pOutputSamples[i*8+4] = (drflac_int32)tempL2 * factor; 11071 pOutputSamples[i*8+5] = (drflac_int32)tempR2 * factor; 11072 pOutputSamples[i*8+6] = (drflac_int32)tempL3 * factor; 11073 pOutputSamples[i*8+7] = (drflac_int32)tempR3 * factor; 11074 } 11075 11076 for (i = (frameCount4 << 2); i < frameCount; ++i) { 11077 pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0) * factor; 11078 pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1) * factor; 11079 } 11080 } 11081 11082 #if defined(DRFLAC_SUPPORT_SSE2) 11083 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo__sse2(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 11084 { 11085 drflac_uint64 i; 11086 drflac_uint64 frameCount4 = frameCount >> 2; 11087 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 11088 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 11089 drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8; 11090 drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8; 11091 11092 float factor = 1.0f / 8388608.0f; 11093 __m128 factor128 = _mm_set1_ps(factor); 11094 11095 for (i = 0; i < frameCount4; ++i) { 11096 __m128i lefti; 11097 __m128i righti; 11098 __m128 leftf; 11099 __m128 rightf; 11100 11101 lefti = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples0 + i), shift0); 11102 righti = _mm_slli_epi32(_mm_loadu_si128((const __m128i*)pInputSamples1 + i), shift1); 11103 11104 leftf = _mm_mul_ps(_mm_cvtepi32_ps(lefti), factor128); 11105 rightf = _mm_mul_ps(_mm_cvtepi32_ps(righti), factor128); 11106 11107 _mm_storeu_ps(pOutputSamples + i*8 + 0, _mm_unpacklo_ps(leftf, rightf)); 11108 _mm_storeu_ps(pOutputSamples + i*8 + 4, _mm_unpackhi_ps(leftf, rightf)); 11109 } 11110 11111 for (i = (frameCount4 << 2); i < frameCount; ++i) { 11112 pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0) * factor; 11113 pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1) * factor; 11114 } 11115 } 11116 #endif 11117 11118 #if defined(DRFLAC_SUPPORT_NEON) 11119 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo__neon(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 11120 { 11121 drflac_uint64 i; 11122 drflac_uint64 frameCount4 = frameCount >> 2; 11123 const drflac_uint32* pInputSamples0U32 = (const drflac_uint32*)pInputSamples0; 11124 const drflac_uint32* pInputSamples1U32 = (const drflac_uint32*)pInputSamples1; 11125 drflac_uint32 shift0 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[0].wastedBitsPerSample) - 8; 11126 drflac_uint32 shift1 = (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[1].wastedBitsPerSample) - 8; 11127 11128 float factor = 1.0f / 8388608.0f; 11129 float32x4_t factor4 = vdupq_n_f32(factor); 11130 int32x4_t shift0_4 = vdupq_n_s32(shift0); 11131 int32x4_t shift1_4 = vdupq_n_s32(shift1); 11132 11133 for (i = 0; i < frameCount4; ++i) { 11134 int32x4_t lefti; 11135 int32x4_t righti; 11136 float32x4_t leftf; 11137 float32x4_t rightf; 11138 11139 lefti = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples0U32 + i*4), shift0_4)); 11140 righti = vreinterpretq_s32_u32(vshlq_u32(vld1q_u32(pInputSamples1U32 + i*4), shift1_4)); 11141 11142 leftf = vmulq_f32(vcvtq_f32_s32(lefti), factor4); 11143 rightf = vmulq_f32(vcvtq_f32_s32(righti), factor4); 11144 11145 drflac__vst2q_f32(pOutputSamples + i*8, vzipq_f32(leftf, rightf)); 11146 } 11147 11148 for (i = (frameCount4 << 2); i < frameCount; ++i) { 11149 pOutputSamples[i*2+0] = (drflac_int32)(pInputSamples0U32[i] << shift0) * factor; 11150 pOutputSamples[i*2+1] = (drflac_int32)(pInputSamples1U32[i] << shift1) * factor; 11151 } 11152 } 11153 #endif 11154 11155 static DRFLAC_INLINE void drflac_read_pcm_frames_f32__decode_independent_stereo(drflac* pFlac, drflac_uint64 frameCount, drflac_uint32 unusedBitsPerSample, const drflac_int32* pInputSamples0, const drflac_int32* pInputSamples1, float* pOutputSamples) 11156 { 11157 #if defined(DRFLAC_SUPPORT_SSE2) 11158 if (drflac__gIsSSE2Supported && pFlac->bitsPerSample <= 24) { 11159 drflac_read_pcm_frames_f32__decode_independent_stereo__sse2(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 11160 } else 11161 #elif defined(DRFLAC_SUPPORT_NEON) 11162 if (drflac__gIsNEONSupported && pFlac->bitsPerSample <= 24) { 11163 drflac_read_pcm_frames_f32__decode_independent_stereo__neon(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 11164 } else 11165 #endif 11166 { 11167 /* Scalar fallback. */ 11168 #if 0 11169 drflac_read_pcm_frames_f32__decode_independent_stereo__reference(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 11170 #else 11171 drflac_read_pcm_frames_f32__decode_independent_stereo__scalar(pFlac, frameCount, unusedBitsPerSample, pInputSamples0, pInputSamples1, pOutputSamples); 11172 #endif 11173 } 11174 } 11175 11176 DRFLAC_API drflac_uint64 drflac_read_pcm_frames_f32(drflac* pFlac, drflac_uint64 framesToRead, float* pBufferOut) 11177 { 11178 drflac_uint64 framesRead; 11179 drflac_uint32 unusedBitsPerSample; 11180 11181 if (pFlac == NULL || framesToRead == 0) { 11182 return 0; 11183 } 11184 11185 if (pBufferOut == NULL) { 11186 return drflac__seek_forward_by_pcm_frames(pFlac, framesToRead); 11187 } 11188 11189 DRFLAC_ASSERT(pFlac->bitsPerSample <= 32); 11190 unusedBitsPerSample = 32 - pFlac->bitsPerSample; 11191 11192 framesRead = 0; 11193 while (framesToRead > 0) { 11194 /* If we've run out of samples in this frame, go to the next. */ 11195 if (pFlac->currentFLACFrame.pcmFramesRemaining == 0) { 11196 if (!drflac__read_and_decode_next_flac_frame(pFlac)) { 11197 break; /* Couldn't read the next frame, so just break from the loop and return. */ 11198 } 11199 } else { 11200 unsigned int channelCount = drflac__get_channel_count_from_channel_assignment(pFlac->currentFLACFrame.header.channelAssignment); 11201 drflac_uint64 iFirstPCMFrame = pFlac->currentFLACFrame.header.blockSizeInPCMFrames - pFlac->currentFLACFrame.pcmFramesRemaining; 11202 drflac_uint64 frameCountThisIteration = framesToRead; 11203 11204 if (frameCountThisIteration > pFlac->currentFLACFrame.pcmFramesRemaining) { 11205 frameCountThisIteration = pFlac->currentFLACFrame.pcmFramesRemaining; 11206 } 11207 11208 if (channelCount == 2) { 11209 const drflac_int32* pDecodedSamples0 = pFlac->currentFLACFrame.subframes[0].pSamplesS32 + iFirstPCMFrame; 11210 const drflac_int32* pDecodedSamples1 = pFlac->currentFLACFrame.subframes[1].pSamplesS32 + iFirstPCMFrame; 11211 11212 switch (pFlac->currentFLACFrame.header.channelAssignment) 11213 { 11214 case DRFLAC_CHANNEL_ASSIGNMENT_LEFT_SIDE: 11215 { 11216 drflac_read_pcm_frames_f32__decode_left_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 11217 } break; 11218 11219 case DRFLAC_CHANNEL_ASSIGNMENT_RIGHT_SIDE: 11220 { 11221 drflac_read_pcm_frames_f32__decode_right_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 11222 } break; 11223 11224 case DRFLAC_CHANNEL_ASSIGNMENT_MID_SIDE: 11225 { 11226 drflac_read_pcm_frames_f32__decode_mid_side(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 11227 } break; 11228 11229 case DRFLAC_CHANNEL_ASSIGNMENT_INDEPENDENT: 11230 default: 11231 { 11232 drflac_read_pcm_frames_f32__decode_independent_stereo(pFlac, frameCountThisIteration, unusedBitsPerSample, pDecodedSamples0, pDecodedSamples1, pBufferOut); 11233 } break; 11234 } 11235 } else { 11236 /* Generic interleaving. */ 11237 drflac_uint64 i; 11238 for (i = 0; i < frameCountThisIteration; ++i) { 11239 unsigned int j; 11240 for (j = 0; j < channelCount; ++j) { 11241 drflac_int32 sampleS32 = (drflac_int32)((drflac_uint32)(pFlac->currentFLACFrame.subframes[j].pSamplesS32[iFirstPCMFrame + i]) << (unusedBitsPerSample + pFlac->currentFLACFrame.subframes[j].wastedBitsPerSample)); 11242 pBufferOut[(i*channelCount)+j] = (float)(sampleS32 / 2147483648.0); 11243 } 11244 } 11245 } 11246 11247 framesRead += frameCountThisIteration; 11248 pBufferOut += frameCountThisIteration * channelCount; 11249 framesToRead -= frameCountThisIteration; 11250 pFlac->currentPCMFrame += frameCountThisIteration; 11251 pFlac->currentFLACFrame.pcmFramesRemaining -= (unsigned int)frameCountThisIteration; 11252 } 11253 } 11254 11255 return framesRead; 11256 } 11257 11258 11259 DRFLAC_API drflac_bool32 drflac_seek_to_pcm_frame(drflac* pFlac, drflac_uint64 pcmFrameIndex) 11260 { 11261 if (pFlac == NULL) { 11262 return DRFLAC_FALSE; 11263 } 11264 11265 /* Don't do anything if we're already on the seek point. */ 11266 if (pFlac->currentPCMFrame == pcmFrameIndex) { 11267 return DRFLAC_TRUE; 11268 } 11269 11270 /* 11271 If we don't know where the first frame begins then we can't seek. This will happen when the STREAMINFO block was not present 11272 when the decoder was opened. 11273 */ 11274 if (pFlac->firstFLACFramePosInBytes == 0) { 11275 return DRFLAC_FALSE; 11276 } 11277 11278 if (pcmFrameIndex == 0) { 11279 pFlac->currentPCMFrame = 0; 11280 return drflac__seek_to_first_frame(pFlac); 11281 } else { 11282 drflac_bool32 wasSuccessful = DRFLAC_FALSE; 11283 11284 /* Clamp the sample to the end. */ 11285 if (pcmFrameIndex > pFlac->totalPCMFrameCount) { 11286 pcmFrameIndex = pFlac->totalPCMFrameCount; 11287 } 11288 11289 /* If the target sample and the current sample are in the same frame we just move the position forward. */ 11290 if (pcmFrameIndex > pFlac->currentPCMFrame) { 11291 /* Forward. */ 11292 drflac_uint32 offset = (drflac_uint32)(pcmFrameIndex - pFlac->currentPCMFrame); 11293 if (pFlac->currentFLACFrame.pcmFramesRemaining > offset) { 11294 pFlac->currentFLACFrame.pcmFramesRemaining -= offset; 11295 pFlac->currentPCMFrame = pcmFrameIndex; 11296 return DRFLAC_TRUE; 11297 } 11298 } else { 11299 /* Backward. */ 11300 drflac_uint32 offsetAbs = (drflac_uint32)(pFlac->currentPCMFrame - pcmFrameIndex); 11301 drflac_uint32 currentFLACFramePCMFrameCount = pFlac->currentFLACFrame.header.blockSizeInPCMFrames; 11302 drflac_uint32 currentFLACFramePCMFramesConsumed = currentFLACFramePCMFrameCount - pFlac->currentFLACFrame.pcmFramesRemaining; 11303 if (currentFLACFramePCMFramesConsumed > offsetAbs) { 11304 pFlac->currentFLACFrame.pcmFramesRemaining += offsetAbs; 11305 pFlac->currentPCMFrame = pcmFrameIndex; 11306 return DRFLAC_TRUE; 11307 } 11308 } 11309 11310 /* 11311 Different techniques depending on encapsulation. Using the native FLAC seektable with Ogg encapsulation is a bit awkward so 11312 we'll instead use Ogg's natural seeking facility. 11313 */ 11314 #ifndef DR_FLAC_NO_OGG 11315 if (pFlac->container == drflac_container_ogg) 11316 { 11317 wasSuccessful = drflac_ogg__seek_to_pcm_frame(pFlac, pcmFrameIndex); 11318 } 11319 else 11320 #endif 11321 { 11322 /* First try seeking via the seek table. If this fails, fall back to a brute force seek which is much slower. */ 11323 if (!pFlac->_noSeekTableSeek) { 11324 wasSuccessful = drflac__seek_to_pcm_frame__seek_table(pFlac, pcmFrameIndex); 11325 } 11326 11327 #if !defined(DR_FLAC_NO_CRC) 11328 /* Fall back to binary search if seek table seeking fails. This requires the length of the stream to be known. */ 11329 if (!wasSuccessful && !pFlac->_noBinarySearchSeek && pFlac->totalPCMFrameCount > 0) { 11330 wasSuccessful = drflac__seek_to_pcm_frame__binary_search(pFlac, pcmFrameIndex); 11331 } 11332 #endif 11333 11334 /* Fall back to brute force if all else fails. */ 11335 if (!wasSuccessful && !pFlac->_noBruteForceSeek) { 11336 wasSuccessful = drflac__seek_to_pcm_frame__brute_force(pFlac, pcmFrameIndex); 11337 } 11338 } 11339 11340 pFlac->currentPCMFrame = pcmFrameIndex; 11341 return wasSuccessful; 11342 } 11343 } 11344 11345 11346 11347 /* High Level APIs */ 11348 11349 #if defined(SIZE_MAX) 11350 #define DRFLAC_SIZE_MAX SIZE_MAX 11351 #else 11352 #if defined(DRFLAC_64BIT) 11353 #define DRFLAC_SIZE_MAX ((drflac_uint64)0xFFFFFFFFFFFFFFFF) 11354 #else 11355 #define DRFLAC_SIZE_MAX 0xFFFFFFFF 11356 #endif 11357 #endif 11358 11359 11360 /* Using a macro as the definition of the drflac__full_decode_and_close_*() API family. Sue me. */ 11361 #define DRFLAC_DEFINE_FULL_READ_AND_CLOSE(extension, type) \ 11362 static type* drflac__full_read_and_close_ ## extension (drflac* pFlac, unsigned int* channelsOut, unsigned int* sampleRateOut, drflac_uint64* totalPCMFrameCountOut)\ 11363 { \ 11364 type* pSampleData = NULL; \ 11365 drflac_uint64 totalPCMFrameCount; \ 11366 \ 11367 DRFLAC_ASSERT(pFlac != NULL); \ 11368 \ 11369 totalPCMFrameCount = pFlac->totalPCMFrameCount; \ 11370 \ 11371 if (totalPCMFrameCount == 0) { \ 11372 type buffer[4096]; \ 11373 drflac_uint64 pcmFramesRead; \ 11374 size_t sampleDataBufferSize = sizeof(buffer); \ 11375 \ 11376 pSampleData = (type*)drflac__malloc_from_callbacks(sampleDataBufferSize, &pFlac->allocationCallbacks); \ 11377 if (pSampleData == NULL) { \ 11378 goto on_error; \ 11379 } \ 11380 \ 11381 while ((pcmFramesRead = (drflac_uint64)drflac_read_pcm_frames_##extension(pFlac, sizeof(buffer)/sizeof(buffer[0])/pFlac->channels, buffer)) > 0) { \ 11382 if (((totalPCMFrameCount + pcmFramesRead) * pFlac->channels * sizeof(type)) > sampleDataBufferSize) { \ 11383 type* pNewSampleData; \ 11384 size_t newSampleDataBufferSize; \ 11385 \ 11386 newSampleDataBufferSize = sampleDataBufferSize * 2; \ 11387 pNewSampleData = (type*)drflac__realloc_from_callbacks(pSampleData, newSampleDataBufferSize, sampleDataBufferSize, &pFlac->allocationCallbacks); \ 11388 if (pNewSampleData == NULL) { \ 11389 drflac__free_from_callbacks(pSampleData, &pFlac->allocationCallbacks); \ 11390 goto on_error; \ 11391 } \ 11392 \ 11393 sampleDataBufferSize = newSampleDataBufferSize; \ 11394 pSampleData = pNewSampleData; \ 11395 } \ 11396 \ 11397 DRFLAC_COPY_MEMORY(pSampleData + (totalPCMFrameCount*pFlac->channels), buffer, (size_t)(pcmFramesRead*pFlac->channels*sizeof(type))); \ 11398 totalPCMFrameCount += pcmFramesRead; \ 11399 } \ 11400 \ 11401 /* At this point everything should be decoded, but we just want to fill the unused part buffer with silence - need to \ 11402 protect those ears from random noise! */ \ 11403 DRFLAC_ZERO_MEMORY(pSampleData + (totalPCMFrameCount*pFlac->channels), (size_t)(sampleDataBufferSize - totalPCMFrameCount*pFlac->channels*sizeof(type))); \ 11404 } else { \ 11405 drflac_uint64 dataSize = totalPCMFrameCount*pFlac->channels*sizeof(type); \ 11406 if (dataSize > DRFLAC_SIZE_MAX) { \ 11407 goto on_error; /* The decoded data is too big. */ \ 11408 } \ 11409 \ 11410 pSampleData = (type*)drflac__malloc_from_callbacks((size_t)dataSize, &pFlac->allocationCallbacks); /* <-- Safe cast as per the check above. */ \ 11411 if (pSampleData == NULL) { \ 11412 goto on_error; \ 11413 } \ 11414 \ 11415 totalPCMFrameCount = drflac_read_pcm_frames_##extension(pFlac, pFlac->totalPCMFrameCount, pSampleData); \ 11416 } \ 11417 \ 11418 if (sampleRateOut) *sampleRateOut = pFlac->sampleRate; \ 11419 if (channelsOut) *channelsOut = pFlac->channels; \ 11420 if (totalPCMFrameCountOut) *totalPCMFrameCountOut = totalPCMFrameCount; \ 11421 \ 11422 drflac_close(pFlac); \ 11423 return pSampleData; \ 11424 \ 11425 on_error: \ 11426 drflac_close(pFlac); \ 11427 return NULL; \ 11428 } 11429 11430 DRFLAC_DEFINE_FULL_READ_AND_CLOSE(s32, drflac_int32) 11431 DRFLAC_DEFINE_FULL_READ_AND_CLOSE(s16, drflac_int16) 11432 DRFLAC_DEFINE_FULL_READ_AND_CLOSE(f32, float) 11433 11434 DRFLAC_API drflac_int32* drflac_open_and_read_pcm_frames_s32(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channelsOut, unsigned int* sampleRateOut, drflac_uint64* totalPCMFrameCountOut, const drflac_allocation_callbacks* pAllocationCallbacks) 11435 { 11436 drflac* pFlac; 11437 11438 if (channelsOut) { 11439 *channelsOut = 0; 11440 } 11441 if (sampleRateOut) { 11442 *sampleRateOut = 0; 11443 } 11444 if (totalPCMFrameCountOut) { 11445 *totalPCMFrameCountOut = 0; 11446 } 11447 11448 pFlac = drflac_open(onRead, onSeek, pUserData, pAllocationCallbacks); 11449 if (pFlac == NULL) { 11450 return NULL; 11451 } 11452 11453 return drflac__full_read_and_close_s32(pFlac, channelsOut, sampleRateOut, totalPCMFrameCountOut); 11454 } 11455 11456 DRFLAC_API drflac_int16* drflac_open_and_read_pcm_frames_s16(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channelsOut, unsigned int* sampleRateOut, drflac_uint64* totalPCMFrameCountOut, const drflac_allocation_callbacks* pAllocationCallbacks) 11457 { 11458 drflac* pFlac; 11459 11460 if (channelsOut) { 11461 *channelsOut = 0; 11462 } 11463 if (sampleRateOut) { 11464 *sampleRateOut = 0; 11465 } 11466 if (totalPCMFrameCountOut) { 11467 *totalPCMFrameCountOut = 0; 11468 } 11469 11470 pFlac = drflac_open(onRead, onSeek, pUserData, pAllocationCallbacks); 11471 if (pFlac == NULL) { 11472 return NULL; 11473 } 11474 11475 return drflac__full_read_and_close_s16(pFlac, channelsOut, sampleRateOut, totalPCMFrameCountOut); 11476 } 11477 11478 DRFLAC_API float* drflac_open_and_read_pcm_frames_f32(drflac_read_proc onRead, drflac_seek_proc onSeek, void* pUserData, unsigned int* channelsOut, unsigned int* sampleRateOut, drflac_uint64* totalPCMFrameCountOut, const drflac_allocation_callbacks* pAllocationCallbacks) 11479 { 11480 drflac* pFlac; 11481 11482 if (channelsOut) { 11483 *channelsOut = 0; 11484 } 11485 if (sampleRateOut) { 11486 *sampleRateOut = 0; 11487 } 11488 if (totalPCMFrameCountOut) { 11489 *totalPCMFrameCountOut = 0; 11490 } 11491 11492 pFlac = drflac_open(onRead, onSeek, pUserData, pAllocationCallbacks); 11493 if (pFlac == NULL) { 11494 return NULL; 11495 } 11496 11497 return drflac__full_read_and_close_f32(pFlac, channelsOut, sampleRateOut, totalPCMFrameCountOut); 11498 } 11499 11500 #ifndef DR_FLAC_NO_STDIO 11501 DRFLAC_API drflac_int32* drflac_open_file_and_read_pcm_frames_s32(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks) 11502 { 11503 drflac* pFlac; 11504 11505 if (sampleRate) { 11506 *sampleRate = 0; 11507 } 11508 if (channels) { 11509 *channels = 0; 11510 } 11511 if (totalPCMFrameCount) { 11512 *totalPCMFrameCount = 0; 11513 } 11514 11515 pFlac = drflac_open_file(filename, pAllocationCallbacks); 11516 if (pFlac == NULL) { 11517 return NULL; 11518 } 11519 11520 return drflac__full_read_and_close_s32(pFlac, channels, sampleRate, totalPCMFrameCount); 11521 } 11522 11523 DRFLAC_API drflac_int16* drflac_open_file_and_read_pcm_frames_s16(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks) 11524 { 11525 drflac* pFlac; 11526 11527 if (sampleRate) { 11528 *sampleRate = 0; 11529 } 11530 if (channels) { 11531 *channels = 0; 11532 } 11533 if (totalPCMFrameCount) { 11534 *totalPCMFrameCount = 0; 11535 } 11536 11537 pFlac = drflac_open_file(filename, pAllocationCallbacks); 11538 if (pFlac == NULL) { 11539 return NULL; 11540 } 11541 11542 return drflac__full_read_and_close_s16(pFlac, channels, sampleRate, totalPCMFrameCount); 11543 } 11544 11545 DRFLAC_API float* drflac_open_file_and_read_pcm_frames_f32(const char* filename, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks) 11546 { 11547 drflac* pFlac; 11548 11549 if (sampleRate) { 11550 *sampleRate = 0; 11551 } 11552 if (channels) { 11553 *channels = 0; 11554 } 11555 if (totalPCMFrameCount) { 11556 *totalPCMFrameCount = 0; 11557 } 11558 11559 pFlac = drflac_open_file(filename, pAllocationCallbacks); 11560 if (pFlac == NULL) { 11561 return NULL; 11562 } 11563 11564 return drflac__full_read_and_close_f32(pFlac, channels, sampleRate, totalPCMFrameCount); 11565 } 11566 #endif 11567 11568 DRFLAC_API drflac_int32* drflac_open_memory_and_read_pcm_frames_s32(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks) 11569 { 11570 drflac* pFlac; 11571 11572 if (sampleRate) { 11573 *sampleRate = 0; 11574 } 11575 if (channels) { 11576 *channels = 0; 11577 } 11578 if (totalPCMFrameCount) { 11579 *totalPCMFrameCount = 0; 11580 } 11581 11582 pFlac = drflac_open_memory(data, dataSize, pAllocationCallbacks); 11583 if (pFlac == NULL) { 11584 return NULL; 11585 } 11586 11587 return drflac__full_read_and_close_s32(pFlac, channels, sampleRate, totalPCMFrameCount); 11588 } 11589 11590 DRFLAC_API drflac_int16* drflac_open_memory_and_read_pcm_frames_s16(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks) 11591 { 11592 drflac* pFlac; 11593 11594 if (sampleRate) { 11595 *sampleRate = 0; 11596 } 11597 if (channels) { 11598 *channels = 0; 11599 } 11600 if (totalPCMFrameCount) { 11601 *totalPCMFrameCount = 0; 11602 } 11603 11604 pFlac = drflac_open_memory(data, dataSize, pAllocationCallbacks); 11605 if (pFlac == NULL) { 11606 return NULL; 11607 } 11608 11609 return drflac__full_read_and_close_s16(pFlac, channels, sampleRate, totalPCMFrameCount); 11610 } 11611 11612 DRFLAC_API float* drflac_open_memory_and_read_pcm_frames_f32(const void* data, size_t dataSize, unsigned int* channels, unsigned int* sampleRate, drflac_uint64* totalPCMFrameCount, const drflac_allocation_callbacks* pAllocationCallbacks) 11613 { 11614 drflac* pFlac; 11615 11616 if (sampleRate) { 11617 *sampleRate = 0; 11618 } 11619 if (channels) { 11620 *channels = 0; 11621 } 11622 if (totalPCMFrameCount) { 11623 *totalPCMFrameCount = 0; 11624 } 11625 11626 pFlac = drflac_open_memory(data, dataSize, pAllocationCallbacks); 11627 if (pFlac == NULL) { 11628 return NULL; 11629 } 11630 11631 return drflac__full_read_and_close_f32(pFlac, channels, sampleRate, totalPCMFrameCount); 11632 } 11633 11634 11635 DRFLAC_API void drflac_free(void* p, const drflac_allocation_callbacks* pAllocationCallbacks) 11636 { 11637 if (pAllocationCallbacks != NULL) { 11638 drflac__free_from_callbacks(p, pAllocationCallbacks); 11639 } else { 11640 drflac__free_default(p, NULL); 11641 } 11642 } 11643 11644 11645 11646 11647 DRFLAC_API void drflac_init_vorbis_comment_iterator(drflac_vorbis_comment_iterator* pIter, drflac_uint32 commentCount, const void* pComments) 11648 { 11649 if (pIter == NULL) { 11650 return; 11651 } 11652 11653 pIter->countRemaining = commentCount; 11654 pIter->pRunningData = (const char*)pComments; 11655 } 11656 11657 DRFLAC_API const char* drflac_next_vorbis_comment(drflac_vorbis_comment_iterator* pIter, drflac_uint32* pCommentLengthOut) 11658 { 11659 drflac_int32 length; 11660 const char* pComment; 11661 11662 /* Safety. */ 11663 if (pCommentLengthOut) { 11664 *pCommentLengthOut = 0; 11665 } 11666 11667 if (pIter == NULL || pIter->countRemaining == 0 || pIter->pRunningData == NULL) { 11668 return NULL; 11669 } 11670 11671 length = drflac__le2host_32(*(const drflac_uint32*)pIter->pRunningData); 11672 pIter->pRunningData += 4; 11673 11674 pComment = pIter->pRunningData; 11675 pIter->pRunningData += length; 11676 pIter->countRemaining -= 1; 11677 11678 if (pCommentLengthOut) { 11679 *pCommentLengthOut = length; 11680 } 11681 11682 return pComment; 11683 } 11684 11685 11686 11687 11688 DRFLAC_API void drflac_init_cuesheet_track_iterator(drflac_cuesheet_track_iterator* pIter, drflac_uint32 trackCount, const void* pTrackData) 11689 { 11690 if (pIter == NULL) { 11691 return; 11692 } 11693 11694 pIter->countRemaining = trackCount; 11695 pIter->pRunningData = (const char*)pTrackData; 11696 } 11697 11698 DRFLAC_API drflac_bool32 drflac_next_cuesheet_track(drflac_cuesheet_track_iterator* pIter, drflac_cuesheet_track* pCuesheetTrack) 11699 { 11700 drflac_cuesheet_track cuesheetTrack; 11701 const char* pRunningData; 11702 drflac_uint64 offsetHi; 11703 drflac_uint64 offsetLo; 11704 11705 if (pIter == NULL || pIter->countRemaining == 0 || pIter->pRunningData == NULL) { 11706 return DRFLAC_FALSE; 11707 } 11708 11709 pRunningData = pIter->pRunningData; 11710 11711 offsetHi = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 11712 offsetLo = drflac__be2host_32(*(const drflac_uint32*)pRunningData); pRunningData += 4; 11713 cuesheetTrack.offset = offsetLo | (offsetHi << 32); 11714 cuesheetTrack.trackNumber = pRunningData[0]; pRunningData += 1; 11715 DRFLAC_COPY_MEMORY(cuesheetTrack.ISRC, pRunningData, sizeof(cuesheetTrack.ISRC)); pRunningData += 12; 11716 cuesheetTrack.isAudio = (pRunningData[0] & 0x80) != 0; 11717 cuesheetTrack.preEmphasis = (pRunningData[0] & 0x40) != 0; pRunningData += 14; 11718 cuesheetTrack.indexCount = pRunningData[0]; pRunningData += 1; 11719 cuesheetTrack.pIndexPoints = (const drflac_cuesheet_track_index*)pRunningData; pRunningData += cuesheetTrack.indexCount * sizeof(drflac_cuesheet_track_index); 11720 11721 pIter->pRunningData = pRunningData; 11722 pIter->countRemaining -= 1; 11723 11724 if (pCuesheetTrack) { 11725 *pCuesheetTrack = cuesheetTrack; 11726 } 11727 11728 return DRFLAC_TRUE; 11729 } 11730 11731 #if defined(__GNUC__) 11732 #pragma GCC diagnostic pop 11733 #endif 11734 #endif /* DR_FLAC_IMPLEMENTATION */ 11735 11736 11737 /* 11738 REVISION HISTORY 11739 ================ 11740 v0.12.13 - 2020-05-16 11741 - Add compile-time and run-time version querying. 11742 - DRFLAC_VERSION_MINOR 11743 - DRFLAC_VERSION_MAJOR 11744 - DRFLAC_VERSION_REVISION 11745 - DRFLAC_VERSION_STRING 11746 - drflac_version() 11747 - drflac_version_string() 11748 11749 v0.12.12 - 2020-04-30 11750 - Fix compilation errors with VC6. 11751 11752 v0.12.11 - 2020-04-19 11753 - Fix some pedantic warnings. 11754 - Fix some undefined behaviour warnings. 11755 11756 v0.12.10 - 2020-04-10 11757 - Fix some bugs when trying to seek with an invalid seek table. 11758 11759 v0.12.9 - 2020-04-05 11760 - Fix warnings. 11761 11762 v0.12.8 - 2020-04-04 11763 - Add drflac_open_file_w() and drflac_open_file_with_metadata_w(). 11764 - Fix some static analysis warnings. 11765 - Minor documentation updates. 11766 11767 v0.12.7 - 2020-03-14 11768 - Fix compilation errors with VC6. 11769 11770 v0.12.6 - 2020-03-07 11771 - Fix compilation error with Visual Studio .NET 2003. 11772 11773 v0.12.5 - 2020-01-30 11774 - Silence some static analysis warnings. 11775 11776 v0.12.4 - 2020-01-29 11777 - Silence some static analysis warnings. 11778 11779 v0.12.3 - 2019-12-02 11780 - Fix some warnings when compiling with GCC and the -Og flag. 11781 - Fix a crash in out-of-memory situations. 11782 - Fix potential integer overflow bug. 11783 - Fix some static analysis warnings. 11784 - Fix a possible crash when using custom memory allocators without a custom realloc() implementation. 11785 - Fix a bug with binary search seeking where the bits per sample is not a multiple of 8. 11786 11787 v0.12.2 - 2019-10-07 11788 - Internal code clean up. 11789 11790 v0.12.1 - 2019-09-29 11791 - Fix some Clang Static Analyzer warnings. 11792 - Fix an unused variable warning. 11793 11794 v0.12.0 - 2019-09-23 11795 - API CHANGE: Add support for user defined memory allocation routines. This system allows the program to specify their own memory allocation 11796 routines with a user data pointer for client-specific contextual data. This adds an extra parameter to the end of the following APIs: 11797 - drflac_open() 11798 - drflac_open_relaxed() 11799 - drflac_open_with_metadata() 11800 - drflac_open_with_metadata_relaxed() 11801 - drflac_open_file() 11802 - drflac_open_file_with_metadata() 11803 - drflac_open_memory() 11804 - drflac_open_memory_with_metadata() 11805 - drflac_open_and_read_pcm_frames_s32() 11806 - drflac_open_and_read_pcm_frames_s16() 11807 - drflac_open_and_read_pcm_frames_f32() 11808 - drflac_open_file_and_read_pcm_frames_s32() 11809 - drflac_open_file_and_read_pcm_frames_s16() 11810 - drflac_open_file_and_read_pcm_frames_f32() 11811 - drflac_open_memory_and_read_pcm_frames_s32() 11812 - drflac_open_memory_and_read_pcm_frames_s16() 11813 - drflac_open_memory_and_read_pcm_frames_f32() 11814 Set this extra parameter to NULL to use defaults which is the same as the previous behaviour. Setting this NULL will use 11815 DRFLAC_MALLOC, DRFLAC_REALLOC and DRFLAC_FREE. 11816 - Remove deprecated APIs: 11817 - drflac_read_s32() 11818 - drflac_read_s16() 11819 - drflac_read_f32() 11820 - drflac_seek_to_sample() 11821 - drflac_open_and_decode_s32() 11822 - drflac_open_and_decode_s16() 11823 - drflac_open_and_decode_f32() 11824 - drflac_open_and_decode_file_s32() 11825 - drflac_open_and_decode_file_s16() 11826 - drflac_open_and_decode_file_f32() 11827 - drflac_open_and_decode_memory_s32() 11828 - drflac_open_and_decode_memory_s16() 11829 - drflac_open_and_decode_memory_f32() 11830 - Remove drflac.totalSampleCount which is now replaced with drflac.totalPCMFrameCount. You can emulate drflac.totalSampleCount 11831 by doing pFlac->totalPCMFrameCount*pFlac->channels. 11832 - Rename drflac.currentFrame to drflac.currentFLACFrame to remove ambiguity with PCM frames. 11833 - Fix errors when seeking to the end of a stream. 11834 - Optimizations to seeking. 11835 - SSE improvements and optimizations. 11836 - ARM NEON optimizations. 11837 - Optimizations to drflac_read_pcm_frames_s16(). 11838 - Optimizations to drflac_read_pcm_frames_s32(). 11839 11840 v0.11.10 - 2019-06-26 11841 - Fix a compiler error. 11842 11843 v0.11.9 - 2019-06-16 11844 - Silence some ThreadSanitizer warnings. 11845 11846 v0.11.8 - 2019-05-21 11847 - Fix warnings. 11848 11849 v0.11.7 - 2019-05-06 11850 - C89 fixes. 11851 11852 v0.11.6 - 2019-05-05 11853 - Add support for C89. 11854 - Fix a compiler warning when CRC is disabled. 11855 - Change license to choice of public domain or MIT-0. 11856 11857 v0.11.5 - 2019-04-19 11858 - Fix a compiler error with GCC. 11859 11860 v0.11.4 - 2019-04-17 11861 - Fix some warnings with GCC when compiling with -std=c99. 11862 11863 v0.11.3 - 2019-04-07 11864 - Silence warnings with GCC. 11865 11866 v0.11.2 - 2019-03-10 11867 - Fix a warning. 11868 11869 v0.11.1 - 2019-02-17 11870 - Fix a potential bug with seeking. 11871 11872 v0.11.0 - 2018-12-16 11873 - API CHANGE: Deprecated drflac_read_s32(), drflac_read_s16() and drflac_read_f32() and replaced them with 11874 drflac_read_pcm_frames_s32(), drflac_read_pcm_frames_s16() and drflac_read_pcm_frames_f32(). The new APIs take 11875 and return PCM frame counts instead of sample counts. To upgrade you will need to change the input count by 11876 dividing it by the channel count, and then do the same with the return value. 11877 - API_CHANGE: Deprecated drflac_seek_to_sample() and replaced with drflac_seek_to_pcm_frame(). Same rules as 11878 the changes to drflac_read_*() apply. 11879 - API CHANGE: Deprecated drflac_open_and_decode_*() and replaced with drflac_open_*_and_read_*(). Same rules as 11880 the changes to drflac_read_*() apply. 11881 - Optimizations. 11882 11883 v0.10.0 - 2018-09-11 11884 - Remove the DR_FLAC_NO_WIN32_IO option and the Win32 file IO functionality. If you need to use Win32 file IO you 11885 need to do it yourself via the callback API. 11886 - Fix the clang build. 11887 - Fix undefined behavior. 11888 - Fix errors with CUESHEET metdata blocks. 11889 - Add an API for iterating over each cuesheet track in the CUESHEET metadata block. This works the same way as the 11890 Vorbis comment API. 11891 - Other miscellaneous bug fixes, mostly relating to invalid FLAC streams. 11892 - Minor optimizations. 11893 11894 v0.9.11 - 2018-08-29 11895 - Fix a bug with sample reconstruction. 11896 11897 v0.9.10 - 2018-08-07 11898 - Improve 64-bit detection. 11899 11900 v0.9.9 - 2018-08-05 11901 - Fix C++ build on older versions of GCC. 11902 11903 v0.9.8 - 2018-07-24 11904 - Fix compilation errors. 11905 11906 v0.9.7 - 2018-07-05 11907 - Fix a warning. 11908 11909 v0.9.6 - 2018-06-29 11910 - Fix some typos. 11911 11912 v0.9.5 - 2018-06-23 11913 - Fix some warnings. 11914 11915 v0.9.4 - 2018-06-14 11916 - Optimizations to seeking. 11917 - Clean up. 11918 11919 v0.9.3 - 2018-05-22 11920 - Bug fix. 11921 11922 v0.9.2 - 2018-05-12 11923 - Fix a compilation error due to a missing break statement. 11924 11925 v0.9.1 - 2018-04-29 11926 - Fix compilation error with Clang. 11927 11928 v0.9 - 2018-04-24 11929 - Fix Clang build. 11930 - Start using major.minor.revision versioning. 11931 11932 v0.8g - 2018-04-19 11933 - Fix build on non-x86/x64 architectures. 11934 11935 v0.8f - 2018-02-02 11936 - Stop pretending to support changing rate/channels mid stream. 11937 11938 v0.8e - 2018-02-01 11939 - Fix a crash when the block size of a frame is larger than the maximum block size defined by the FLAC stream. 11940 - Fix a crash the the Rice partition order is invalid. 11941 11942 v0.8d - 2017-09-22 11943 - Add support for decoding streams with ID3 tags. ID3 tags are just skipped. 11944 11945 v0.8c - 2017-09-07 11946 - Fix warning on non-x86/x64 architectures. 11947 11948 v0.8b - 2017-08-19 11949 - Fix build on non-x86/x64 architectures. 11950 11951 v0.8a - 2017-08-13 11952 - A small optimization for the Clang build. 11953 11954 v0.8 - 2017-08-12 11955 - API CHANGE: Rename dr_* types to drflac_*. 11956 - Optimizations. This brings dr_flac back to about the same class of efficiency as the reference implementation. 11957 - Add support for custom implementations of malloc(), realloc(), etc. 11958 - Add CRC checking to Ogg encapsulated streams. 11959 - Fix VC++ 6 build. This is only for the C++ compiler. The C compiler is not currently supported. 11960 - Bug fixes. 11961 11962 v0.7 - 2017-07-23 11963 - Add support for opening a stream without a header block. To do this, use drflac_open_relaxed() / drflac_open_with_metadata_relaxed(). 11964 11965 v0.6 - 2017-07-22 11966 - Add support for recovering from invalid frames. With this change, dr_flac will simply skip over invalid frames as if they 11967 never existed. Frames are checked against their sync code, the CRC-8 of the frame header and the CRC-16 of the whole frame. 11968 11969 v0.5 - 2017-07-16 11970 - Fix typos. 11971 - Change drflac_bool* types to unsigned. 11972 - Add CRC checking. This makes dr_flac slower, but can be disabled with #define DR_FLAC_NO_CRC. 11973 11974 v0.4f - 2017-03-10 11975 - Fix a couple of bugs with the bitstreaming code. 11976 11977 v0.4e - 2017-02-17 11978 - Fix some warnings. 11979 11980 v0.4d - 2016-12-26 11981 - Add support for 32-bit floating-point PCM decoding. 11982 - Use drflac_int* and drflac_uint* sized types to improve compiler support. 11983 - Minor improvements to documentation. 11984 11985 v0.4c - 2016-12-26 11986 - Add support for signed 16-bit integer PCM decoding. 11987 11988 v0.4b - 2016-10-23 11989 - A minor change to drflac_bool8 and drflac_bool32 types. 11990 11991 v0.4a - 2016-10-11 11992 - Rename drBool32 to drflac_bool32 for styling consistency. 11993 11994 v0.4 - 2016-09-29 11995 - API/ABI CHANGE: Use fixed size 32-bit booleans instead of the built-in bool type. 11996 - API CHANGE: Rename drflac_open_and_decode*() to drflac_open_and_decode*_s32(). 11997 - API CHANGE: Swap the order of "channels" and "sampleRate" parameters in drflac_open_and_decode*(). Rationale for this is to 11998 keep it consistent with drflac_audio. 11999 12000 v0.3f - 2016-09-21 12001 - Fix a warning with GCC. 12002 12003 v0.3e - 2016-09-18 12004 - Fixed a bug where GCC 4.3+ was not getting properly identified. 12005 - Fixed a few typos. 12006 - Changed date formats to ISO 8601 (YYYY-MM-DD). 12007 12008 v0.3d - 2016-06-11 12009 - Minor clean up. 12010 12011 v0.3c - 2016-05-28 12012 - Fixed compilation error. 12013 12014 v0.3b - 2016-05-16 12015 - Fixed Linux/GCC build. 12016 - Updated documentation. 12017 12018 v0.3a - 2016-05-15 12019 - Minor fixes to documentation. 12020 12021 v0.3 - 2016-05-11 12022 - Optimizations. Now at about parity with the reference implementation on 32-bit builds. 12023 - Lots of clean up. 12024 12025 v0.2b - 2016-05-10 12026 - Bug fixes. 12027 12028 v0.2a - 2016-05-10 12029 - Made drflac_open_and_decode() more robust. 12030 - Removed an unused debugging variable 12031 12032 v0.2 - 2016-05-09 12033 - Added support for Ogg encapsulation. 12034 - API CHANGE. Have the onSeek callback take a third argument which specifies whether or not the seek 12035 should be relative to the start or the current position. Also changes the seeking rules such that 12036 seeking offsets will never be negative. 12037 - Have drflac_open_and_decode() fail gracefully if the stream has an unknown total sample count. 12038 12039 v0.1b - 2016-05-07 12040 - Properly close the file handle in drflac_open_file() and family when the decoder fails to initialize. 12041 - Removed a stale comment. 12042 12043 v0.1a - 2016-05-05 12044 - Minor formatting changes. 12045 - Fixed a warning on the GCC build. 12046 12047 v0.1 - 2016-05-03 12048 - Initial versioned release. 12049 */ 12050 12051 /* 12052 This software is available as a choice of the following licenses. Choose 12053 whichever you prefer. 12054 12055 =============================================================================== 12056 ALTERNATIVE 1 - Public Domain (www.unlicense.org) 12057 =============================================================================== 12058 This is free and unencumbered software released into the public domain. 12059 12060 Anyone is free to copy, modify, publish, use, compile, sell, or distribute this 12061 software, either in source code form or as a compiled binary, for any purpose, 12062 commercial or non-commercial, and by any means. 12063 12064 In jurisdictions that recognize copyright laws, the author or authors of this 12065 software dedicate any and all copyright interest in the software to the public 12066 domain. We make this dedication for the benefit of the public at large and to 12067 the detriment of our heirs and successors. We intend this dedication to be an 12068 overt act of relinquishment in perpetuity of all present and future rights to 12069 this software under copyright law. 12070 12071 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 12072 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 12073 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 12074 AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN 12075 ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION 12076 WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 12077 12078 For more information, please refer to <http://unlicense.org/> 12079 12080 =============================================================================== 12081 ALTERNATIVE 2 - MIT No Attribution 12082 =============================================================================== 12083 Copyright 2020 David Reid 12084 12085 Permission is hereby granted, free of charge, to any person obtaining a copy of 12086 this software and associated documentation files (the "Software"), to deal in 12087 the Software without restriction, including without limitation the rights to 12088 use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 12089 of the Software, and to permit persons to whom the Software is furnished to do 12090 so. 12091 12092 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 12093 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 12094 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 12095 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 12096 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 12097 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 12098 SOFTWARE. 12099 */