/* stb_image - v2.28 - public domain image loader - http://nothings.org/stb
   no warranty implied; use at your own risk

   #define STB_IMAGE_IMPLEMENTATION
   before you include this file in *one* C or C++ file to create the implementation.

   // i.e. it should look like this:
   #define STB_IMAGE_IMPLEMENTATION
   #include "stb_image.h"

   You can #define STBI_ASSERT(x) before the #include to avoid using assert.h.
   And #define STBI_MALLOC, STBI_REALLOC, and STBI_FREE to avoid using malloc, realloc, free

   Primarily of interest to game developers and other people who can
   avoid problematic images and only need the trivial interface

   JPEG baseline & progressive (12 bpc/arithmetic not supported, same as stock IJG lib)
   PNG 1/2/4/8/16-bit-per-channel
   TGA (not sure what subset, if a subset)
   PSD (composited view only, no extra channels, 8/16 bit-per-channel)
   GIF (*comp always reports as 4-channel)
   HDR (radiance rgbE format)
   PNM (PPM and PGM binary only)

   Animated GIF still needs a proper API, but here's one way to do it:
   http://gist.github.com/urraka/685d9a6340b26b830d49

   - decode from memory or through FILE (define STBI_NO_STDIO to remove code)
   - decode from arbitrary I/O callbacks
   - SIMD acceleration on x86/x64 (SSE2) and ARM (NEON)

   Full documentation under "DOCUMENTATION" below.

   See end of file for license information.

RECENT REVISION HISTORY:

   2.28 (2023-01-29) many error fixes, security errors, just tons of stuff
   2.27 (2021-07-11) document stbi_info better, 16-bit PNM support, bug fixes
   2.26 (2020-07-13) many minor fixes
   2.25 (2020-02-02) fix warnings
   2.24 (2020-02-02) fix warnings; thread-local failure_reason and flip_vertically
   2.23 (2019-08-11) fix clang static analysis warning
   2.22 (2019-03-04) gif fixes, fix warnings
   2.21 (2019-02-25) fix typo in comment
   2.20 (2019-02-07) support utf8 filenames in Windows; fix warnings and platform ifdefs
   2.19 (2018-02-11) fix warning
   2.18 (2018-01-30) fix warnings
   2.17 (2018-01-29) bugfix, 1-bit BMP, 16-bitness query, fix warnings
   2.16 (2017-07-23) all functions have 16-bit variants; optimizations; bugfixes
   2.15 (2017-03-18) fix png-1,2,4; all Imagenet JPGs; no runtime SSE detection on GCC
   2.14 (2017-03-03) remove deprecated STBI_JPEG_OLD; fixes for Imagenet JPGs
   2.13 (2016-12-04) experimental 16-bit API, only for PNG so far; fixes
   2.12 (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
   2.11 (2016-04-02) 16-bit PNGS; enable SSE2 in non-gcc x64
                     RGB-format JPEG; remove white matting in PSD;
                     allocate large structures on the stack;
                     correct channel count for PNG & BMP
   2.10 (2016-01-22) avoid warning introduced in 2.09
   2.09 (2016-01-16) 16-bit TGA; comments in PNM files; STBI_REALLOC_SIZED

   See end of file for full revision history.

 ============================    Contributors    =========================

 Image formats                          Extensions, features
    Sean Barrett (jpeg, png, bmp)          Jetro Lauha (stbi_info)
    Nicolas Schulz (hdr, psd)              Martin "SpartanJ" Golini (stbi_info)
    Jonathan Dummer (tga)                  James "moose2000" Brown (iPhone PNG)
    Jean-Marc Lienher (gif)                Ben "Disch" Wenger (io callbacks)
    Tom Seddon (pic)                       Omar Cornut (1/2/4-bit PNG)
    Thatcher Ulrich (psd)                  Nicolas Guillemot (vertical flip)
    Ken Miller (pgm, ppm)                  Richard Mitton (16-bit PSD)
    github:urraka (animated gif)           Junggon Kim (PNM comments)
    Christopher Forseth (animated gif)     Daniel Gibson (16-bit TGA)
                                           socks-the-fox (16-bit PNG)
                                           Jeremy Sawicki (handle all ImageNet JPGs)
 Optimizations & bugfixes                  Mikhail Morozov (1-bit BMP)
    Fabian "ryg" Giesen                    Anael Seghezzi (is-16-bit query)
    Arseny Kapoulkine                      Simon Breuss (16-bit PNM)

    Marc LeBlanc   David Woo   Guillaume George   Martins Mozeiko
    Christpher Lloyd   Jerry Jansson   Joseph Thomson   Blazej Dariusz Roszkowski
    Phil Jordan   Dave Moore   Roy Eltham
    Hayaki Saito   Nathan Reed   Won Chun
    Luke Graham   Johan Duparc   Nick Verigakis   the Horde3D community
    Thomas Ruf   Ronny Chevalier   github:rlyeh
    Janez Zemva   John Bartholomew   Michal Cichon   github:romigrou
    Jonathan Blow   Ken Hamada   Tero Hanninen   github:svdijk
    Eugene Golushkov   Laurent Gomila   Cort Stratton   github:snagar
    Aruelien Pocheville   Sergio Gonzalez   Thibault Reuille   github:Zelex
    Cass Everitt   Ryamond Barbiero   github:grim210
    Paul Du Bois   Engin Manap   Aldo Culquicondor   github:sammyhw
    Philipp Wiesemann   Dale Weiler   Oriol Ferrer Mesia   github:phprus
    Josh Tobin   Neil Bickford   Matthew Gregan   github:poppolopoppo
    Julian Raschke   Gregory Mullen   Christian Floisand   github:darealshinji
    Baldur Karlsson   Kevin Schmidt   JR Smith   github:Michaelangel007
    Brad Weinberger   Matvey Cherevko   github:mosra
    Luca Sas   Alexander Veselov   Zack Middleton   [reserved]
    Ryan C. Gordon   [reserved]   [reserved]
    DO NOT ADD YOUR NAME HERE

 To add your name to the credits, pick a random blank space in the middle and fill it.
 80% of merge conflicts on stb PRs are due to people adding their name at the end
*/

#ifndef STBI_INCLUDE_STB_IMAGE_H
#define STBI_INCLUDE_STB_IMAGE_H

//    - no 12-bit-per-channel JPEG
//    - no JPEGs with arithmetic coding
//    - GIF always returns *comp=4
//
// Basic usage (see HDR discussion below for HDR usage):
//
//    unsigned char *data = stbi_load(filename, &x, &y, &n, 0);
//    // ... process data if not NULL ...
//    // ... x = width, y = height, n = # 8-bit components per pixel ...
//    // ... replace '0' with '1'..'4' to force that many components per pixel
//    // ... but 'n' will always be the number that it would have been if you said 0
//    stbi_image_free(data);
//
// Standard parameters:
//    int *x                 -- outputs image width in pixels
//    int *y                 -- outputs image height in pixels
//    int *channels_in_file  -- outputs # of image components in image file
//    int desired_channels   -- if non-zero, # of image components requested in result
//
// The return value from an image loader is an 'unsigned char *' which points
// to the pixel data, or NULL on an allocation failure or if the image is
// corrupt or invalid. The pixel data consists of *y scanlines of *x pixels,
// with each pixel consisting of N interleaved 8-bit components; the first
// pixel pointed to is top-left-most in the image. There is no padding between
// image scanlines or between pixels, regardless of format. The number of
// components N is 'desired_channels' if desired_channels is non-zero, or
// *channels_in_file otherwise. If desired_channels is non-zero,
// *channels_in_file has the number of components that _would_ have been
// output otherwise. E.g. if you set desired_channels to 4, you will always
// get RGBA output, but you can check *channels_in_file to see if it's trivially
// opaque because e.g. there were only 3 channels in the source image.
//
// An output image with N components has the following components interleaved
// in this order in each pixel:
//
//     N=#comp     components
//       1           grey
//       2           grey, alpha
//       3           red, green, blue
//       4           red, green, blue, alpha
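//
// As a hedged illustration of the layout described above (a sketch, not part
// of stb_image): pixel_offset and pixel_alpha are hypothetical helper names.
// They compute where a component lives in the tightly packed buffer and read
// a pixel's alpha, relying on alpha being the last component when N is 2 or 4.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of stb_image): index of component c of the
   pixel at column x, row y, in a tightly packed buffer that is `width`
   pixels wide with n interleaved 8-bit components per pixel. */
static size_t pixel_offset(int width, int n, int x, int y, int c)
{
    assert(width > 0 && n >= 1 && n <= 4);
    assert(x >= 0 && x < width && y >= 0 && c >= 0 && c < n);
    /* rows are stored top to bottom with no padding between scanlines */
    return ((size_t)y * (size_t)width + (size_t)x) * (size_t)n + (size_t)c;
}

/* Hypothetical helper: alpha of a pixel. Buffers without an alpha channel
   (n == 1 grey, n == 3 RGB) are treated as fully opaque. */
static unsigned char pixel_alpha(const unsigned char *data, int width, int n,
                                 int x, int y)
{
    if (n == 2 || n == 4)
        return data[pixel_offset(width, n, x, y, n - 1)];
    return 255;
}
```

// With a buffer returned by stbi_load, data[pixel_offset(x, n, px, py, 0)]
// would be the first component of pixel (px, py), where x is the image width
// and n is the number of components actually present in the buffer.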
//
// If image loading fails for any reason, the return value will be NULL,
// and *x, *y, *channels_in_file will be unchanged. The function
// stbi_failure_reason() can be queried for an extremely brief, end-user
// unfriendly explanation of why the load failed. Define STBI_NO_FAILURE_STRINGS
// to avoid compiling these strings at all, and STBI_FAILURE_USERMSG to get slightly
// more user-friendly ones.
//
// Paletted PNG, BMP, GIF, and PIC images are automatically depalettized.
//
// To query the width, height and component count of an image without having to
// decode the full file, you can use the stbi_info family of functions:
//
//    ok = stbi_info(filename, &x, &y, &n);
//    // returns ok=1 and sets x, y, n if image is a supported format,
//    // 0 otherwise.
//
// Note that stb_image pervasively uses ints in its public API for sizes,
// including sizes of memory buffers. This is now part of the API and thus
// hard to change without causing breakage. As a result, the various image
// loaders all have certain limits on image size; these differ somewhat
// by format but generally boil down to either just under 2GB or just under
// 1GB. When the decoded image would be larger than this, stb_image decoding
// will fail.
//
// Additionally, stb_image will reject image files that have any of their
// dimensions set to a larger value than the configurable STBI_MAX_DIMENSIONS,
// which defaults to 2**24 = 16777216 pixels. Due to the above memory limit,
// the only way to have an image with such dimensions load correctly
// is for it to have a rather extreme aspect ratio. Either way, the
// assumption here is that such larger images are likely to be malformed
// or malicious. If you do need to load an image with individual dimensions
// larger than that, and it still fits in the overall size limit, you can
// #define STBI_MAX_DIMENSIONS on your own to be something larger.
//
// ===========================================================================
//
// If compiling for Windows and you wish to use Unicode filenames, compile
// with
//     #define STBI_WINDOWS_UTF8
// and pass utf8-encoded filenames. Call stbi_convert_wchar_to_utf8 to convert
// Windows wchar_t filenames to utf8.
//
// ===========================================================================
//
// stb libraries are designed with the following priorities:
//
//    1. easy to use
//    2. easy to maintain
//    3. good performance
//
// Sometimes I let "good performance" creep up in priority over "easy to maintain",
// and for best performance I may provide less-easy-to-use APIs that give higher
// performance, in addition to the easy-to-use ones. Nevertheless, it's important
// to keep in mind that from the standpoint of you, a client of this library,
// all you care about is #1 and #3, and stb libraries DO NOT emphasize #3 above all.
//
// Some secondary priorities arise directly from the first two, some of which
// provide more explicit reasons why performance can't be emphasized.
//
//    - Portable ("ease of use")
//    - Small source code footprint ("easy to maintain")
//    - No dependencies ("ease of use")
//
// ===========================================================================
//
// I/O callbacks allow you to read from arbitrary sources, like packaged
// files or some other source. Data read from callbacks are processed
// through a small internal buffer (currently 128 bytes) to try to reduce
// overhead.
//
// The three functions you must define are "read" (reads some bytes of data),
// "skip" (skips some bytes of data), "eof" (reports if the stream is at the end).
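//
// A minimal sketch of the three callbacks, reading from a block of memory.
// The names mem_stream, mem_read, mem_skip, mem_eof and io_callbacks_sketch
// are hypothetical; io_callbacks_sketch is only a stand-in with the same
// three members as stbi_io_callbacks so the sketch compiles on its own.

```c
#include <assert.h>
#include <string.h>

/* Stand-in mirroring the read/skip/eof members of stbi_io_callbacks. */
typedef struct {
    int  (*read)(void *user, char *data, int size);
    void (*skip)(void *user, int n);
    int  (*eof) (void *user);
} io_callbacks_sketch;

/* Hypothetical user-data type: a read cursor over a block of memory. */
typedef struct {
    const unsigned char *bytes;
    int len;
    int pos;
} mem_stream;

/* fill 'data' with up to 'size' bytes; return the number actually read */
static int mem_read(void *user, char *data, int size)
{
    mem_stream *m = (mem_stream *)user;
    int left = m->len - m->pos;
    if (size > left) size = left;
    if (size <= 0) return 0;
    memcpy(data, m->bytes + m->pos, (size_t)size);
    m->pos += size;
    return size;
}

/* skip the next n bytes, or step back -n bytes if n is negative */
static void mem_skip(void *user, int n)
{
    mem_stream *m = (mem_stream *)user;
    long long pos = (long long)m->pos + n;   /* widened to avoid int overflow */
    if (pos < 0) pos = 0;
    if (pos > m->len) pos = m->len;
    m->pos = (int)pos;
}

/* nonzero once the cursor has reached the end of the data */
static int mem_eof(void *user)
{
    mem_stream *m = (mem_stream *)user;
    return m->pos >= m->len;
}

static const io_callbacks_sketch mem_callbacks = { mem_read, mem_skip, mem_eof };
```

// With the real header, the same three functions would go into an
// stbi_io_callbacks value passed to stbi_load_from_callbacks together with
// a pointer to the mem_stream as the 'user' argument.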
//
// ===========================================================================
//
// The JPEG decoder will try to automatically use SIMD kernels on x86 when
// supported by the compiler. For ARM Neon support, you must explicitly
// request it.
//
// (The old do-it-yourself SIMD API is no longer supported in the current
// code.)
//
// On x86, SSE2 will automatically be used when available based on a run-time
// test; if not, the generic C versions are used as a fall-back. (With GCC-style
// compilers there is no run-time test: SSE2 is used only when the compiler is
// already allowed to emit it, e.g. via -msse2.) On ARM targets,
// the typical path is to have separate builds for NEON and non-NEON devices
// (at least this is true for iOS and Android). Therefore, the NEON support is
// toggled by a build flag: define STBI_NEON to get NEON loops.
//
// If for some reason you do not want to use any of the SIMD code, or if
// you have issues compiling it, you can disable it entirely by
// defining STBI_NO_SIMD.
//
// ===========================================================================
//
// HDR image support (disable by defining STBI_NO_HDR)
//
// stb_image supports loading HDR images in general, and currently the Radiance
// .HDR file format specifically. You can still load any file through the existing
// interface; if you attempt to load an HDR file, it will be automatically remapped
// to LDR, assuming gamma 2.2 and an arbitrary scale factor defaulting to 1;
// both of these constants can be reconfigured through this interface:
//
//     stbi_hdr_to_ldr_gamma(2.2f);
//     stbi_hdr_to_ldr_scale(1.0f);
//
// (note, do not use _inverse_ constants; stb_image will invert them
// appropriately!)
//
// Additionally, there is a new, parallel interface for loading files as
// (linear) floats to preserve the full dynamic range:
//
//    float *data = stbi_loadf(filename, &x, &y, &n, 0);
//
// If you load LDR images through this interface, those images will
// be promoted to floating point values, run through the inverse of
// constants corresponding to the above:
//
//     stbi_ldr_to_hdr_scale(1.0f);
//     stbi_ldr_to_hdr_gamma(2.2f);
//
// Finally, given a filename (or an open file or memory block--see header
// file for details) containing image data, you can query for the "most
// appropriate" interface to use (that is, whether the image is HDR or
// not), using:
//
//     stbi_is_hdr(char *filename);
//
// ===========================================================================
//
// iPhone PNG support:
//
// We optionally support converting iPhone-formatted PNGs (which store
// premultiplied BGRA) back to RGB, even though they're internally encoded
// differently. To enable this conversion, call
// stbi_convert_iphone_png_to_rgb(1).
//
// Call stbi_set_unpremultiply_on_load(1) as well to force a divide per
// pixel to remove any premultiplied alpha *only* if the image file explicitly
// says there's premultiplied data (currently only happens in iPhone images,
// and only if iPhone convert-to-rgb processing is on).
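//
// To make the "divide per pixel" concrete, here is one plausible way to turn
// a single premultiplied BGRA pixel into straight-alpha RGBA. This is a
// sketch under assumptions (8-bit channels, round-to-nearest division, a
// hypothetical function name); it is not claimed to match stb_image's
// internal code.

```c
#include <assert.h>

/* Hypothetical helper: convert one premultiplied BGRA pixel (p[0]=B, p[1]=G,
   p[2]=R, p[3]=A) to straight-alpha RGBA in place. */
static void bgra_premul_to_rgba(unsigned char p[4])
{
    unsigned a = p[3];
    unsigned b = p[0];
    unsigned g = p[1];
    unsigned r = p[2];

    if (a != 0) {
        unsigned half = a / 2;              /* rounds the division to nearest */
        r = (r * 255u + half) / a;
        g = (g * 255u + half) / a;
        b = (b * 255u + half) / a;
        /* well-formed premultiplied data never exceeds 255 here; clamp so
           malformed input (color > alpha) cannot wrap around */
        if (r > 255u) r = 255u;
        if (g > 255u) g = 255u;
        if (b > 255u) b = 255u;
    }
    /* with zero alpha there is nothing to divide by: only reorder channels */
    p[0] = (unsigned char)r;
    p[1] = (unsigned char)g;
    p[2] = (unsigned char)b;
    p[3] = (unsigned char)a;
}
```

// The clamp is a choice made for this sketch only; it says nothing about how
// the library itself treats color values larger than alpha.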
//
// ===========================================================================
//
// ADDITIONAL CONFIGURATION
//
//  - You can suppress implementation of any of the decoders to reduce
//    your code footprint by #defining one or more of the following
//    symbols before creating the implementation.
//
//        STBI_NO_PNM   (.ppm and .pgm)
//
//  - You can request *only* certain decoders and suppress all other ones
//    (this will be more forward-compatible, as addition of new decoders
//    doesn't require you to disable them explicitly):
//
//        STBI_ONLY_PNM   (.ppm and .pgm)
//
//  - If you use STBI_NO_PNG (or _ONLY_ without PNG), and you still
//    want the zlib decoder to be available, #define STBI_SUPPORT_ZLIB
//
//  - If you define STBI_MAX_DIMENSIONS, stb_image will reject images greater
//    than that size (in either width or height) without further processing.
//    This is to let programs in the wild set an upper bound to prevent
//    denial-of-service attacks on untrusted data, as one could generate a
//    valid image of gigantic dimensions and force stb_image to allocate a
//    huge block of memory and spend disproportionate time decoding it. By
//    default this is set to (1 << 24), which is 16777216, but that's still
//    very big.
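//
// A minimal configuration sketch combining the options above. It assumes a
// project that only needs PNG and JPEG and never expects an image wider or
// taller than 8192 pixels; both choices are illustrative assumptions, not
// recommendations.

```c
/* In exactly one C or C++ file of the client program: */
#define STBI_ONLY_PNG                /* keep the PNG decoder ...           */
#define STBI_ONLY_JPEG               /* ... and the JPEG decoder, drop the rest */
#define STBI_MAX_DIMENSIONS 8192     /* assumed policy: reject larger width/height */
#define STB_IMAGE_IMPLEMENTATION
#include "stb_image.h"
```

// All of these symbols must be defined before the #include that creates the
// implementation.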

#ifndef STBI_NO_STDIO

#endif // STBI_NO_STDIO

#define STBI_VERSION 1

   STBI_default = 0, // only used for desired_channels

typedef unsigned char stbi_uc;
typedef unsigned short stbi_us;

#ifdef STB_IMAGE_STATIC
#define STBIDEF static

#define STBIDEF extern

//////////////////////////////////////////////////////////////////////////////
//
// PRIMARY API - works on images of any type
//
// load image by filename, open file, or memory buffer

   int (*read) (void *user,char *data,int size); // fill 'data' with 'size' bytes. return number of bytes actually read
   void (*skip) (void *user,int n); // skip the next 'n' bytes, or 'unget' the last -n bytes if negative
   int (*eof) (void *user); // returns nonzero if we are at end of file/data

////////////////////////////////////
//
// 8-bits-per-channel interface
//

STBIDEF stbi_uc *stbi_load_from_memory (stbi_uc const *buffer, int len , int *x, int *y, int *channels_in_file, int desired_channels);
STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk , void *user, int *x, int *y, int *channels_in_file, int desired_channels);

#ifndef STBI_NO_STDIO
STBIDEF stbi_uc *stbi_load (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels);
STBIDEF stbi_uc *stbi_load_from_file (FILE *f, int *x, int *y, int *channels_in_file, int desired_channels);
// for stbi_load_from_file, file pointer is left pointing immediately after image

STBIDEF stbi_uc *stbi_load_gif_from_memory(stbi_uc const *buffer, int len, int **delays, int *x, int *y, int *z, int *comp, int req_comp);

#ifdef STBI_WINDOWS_UTF8
STBIDEF int stbi_convert_wchar_to_utf8(char *buffer, size_t bufferlen, const wchar_t* input);

////////////////////////////////////
//
// 16-bits-per-channel interface
//

STBIDEF stbi_us *stbi_load_16_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels);
STBIDEF stbi_us *stbi_load_16_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *channels_in_file, int desired_channels);

#ifndef STBI_NO_STDIO
STBIDEF stbi_us *stbi_load_16 (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels);
STBIDEF stbi_us *stbi_load_from_file_16(FILE *f, int *x, int *y, int *channels_in_file, int desired_channels);

////////////////////////////////////
//
// float-per-channel interface
//
#ifndef STBI_NO_LINEAR
STBIDEF float *stbi_loadf_from_memory (stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels);
STBIDEF float *stbi_loadf_from_callbacks (stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *channels_in_file, int desired_channels);

#ifndef STBI_NO_STDIO
STBIDEF float *stbi_loadf (char const *filename, int *x, int *y, int *channels_in_file, int desired_channels);
STBIDEF float *stbi_loadf_from_file (FILE *f, int *x, int *y, int *channels_in_file, int desired_channels);

STBIDEF void stbi_hdr_to_ldr_gamma(float gamma);
STBIDEF void stbi_hdr_to_ldr_scale(float scale);
#endif // STBI_NO_HDR

#ifndef STBI_NO_LINEAR
STBIDEF void stbi_ldr_to_hdr_gamma(float gamma);
STBIDEF void stbi_ldr_to_hdr_scale(float scale);
#endif // STBI_NO_LINEAR

// stbi_is_hdr is always defined, but always returns false if STBI_NO_HDR
STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user);
STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len);
#ifndef STBI_NO_STDIO
STBIDEF int stbi_is_hdr (char const *filename);
STBIDEF int stbi_is_hdr_from_file(FILE *f);
#endif // STBI_NO_STDIO

// get a VERY brief reason for failure
// on most compilers (and ALL modern mainstream compilers) this is threadsafe
STBIDEF const char *stbi_failure_reason (void);

// free the loaded image -- this is just free()
STBIDEF void stbi_image_free (void *retval_from_stbi_load);

// get image dimensions & components without fully decoding
STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp);
STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp);
STBIDEF int stbi_is_16_bit_from_memory(stbi_uc const *buffer, int len);
STBIDEF int stbi_is_16_bit_from_callbacks(stbi_io_callbacks const *clbk, void *user);

#ifndef STBI_NO_STDIO
STBIDEF int stbi_info (char const *filename, int *x, int *y, int *comp);
STBIDEF int stbi_info_from_file (FILE *f, int *x, int *y, int *comp);
STBIDEF int stbi_is_16_bit (char const *filename);
STBIDEF int stbi_is_16_bit_from_file(FILE *f);

// for image formats that explicitly notate that they have premultiplied alpha,
// we just return the colors as stored in the file. set this flag to force
// unpremultiplication. results are undefined if the unpremultiply overflows.
STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply);

// indicate whether we should process iphone images back to canonical format,
// or just pass them through "as-is"
STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert);

// flip the image vertically, so the first pixel in the output array is the bottom left
STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip);

// as above, but only applies to images loaded on the thread that calls the function
// this function is only available if your compiler supports thread-local variables;
// calling it will fail to link if your compiler doesn't
STBIDEF void stbi_set_unpremultiply_on_load_thread(int flag_true_if_should_unpremultiply);
STBIDEF void stbi_convert_iphone_png_to_rgb_thread(int flag_true_if_should_convert);
STBIDEF void stbi_set_flip_vertically_on_load_thread(int flag_true_if_should_flip);

// ZLIB client - used by PNG, available for other purposes

STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen);
STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header);
STBIDEF char *stbi_zlib_decode_malloc(const char *buffer, int len, int *outlen);
STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);

STBIDEF char *stbi_zlib_decode_noheader_malloc(const char *buffer, int len, int *outlen);
STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen);

//// end header file /////////////////////////////////////////////////////
#endif // STBI_INCLUDE_STB_IMAGE_H

#ifdef STB_IMAGE_IMPLEMENTATION

#if defined(STBI_ONLY_JPEG) || defined(STBI_ONLY_PNG) || defined(STBI_ONLY_BMP) \
  || defined(STBI_ONLY_TGA) || defined(STBI_ONLY_GIF) || defined(STBI_ONLY_PSD) \
  || defined(STBI_ONLY_HDR) || defined(STBI_ONLY_PIC) || defined(STBI_ONLY_PNM) \
  || defined(STBI_ONLY_ZLIB)
#ifndef STBI_ONLY_JPEG

#ifndef STBI_ONLY_PNG

#ifndef STBI_ONLY_BMP

#ifndef STBI_ONLY_PSD

#ifndef STBI_ONLY_TGA

#ifndef STBI_ONLY_GIF

#ifndef STBI_ONLY_HDR

#ifndef STBI_ONLY_PIC

#ifndef STBI_ONLY_PNM

#if defined(STBI_NO_PNG) && !defined(STBI_SUPPORT_ZLIB) && !defined(STBI_NO_ZLIB)

#include <stddef.h> // ptrdiff_t on osx

#if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR)
#include <math.h> // ldexp, pow

#ifndef STBI_NO_STDIO

#define STBI_ASSERT(x) assert(x)

#define STBI_EXTERN extern "C"

#define STBI_EXTERN extern

#define stbi_inline inline

#define stbi_inline __forceinline

#ifndef STBI_NO_THREAD_LOCALS
#if defined(__cplusplus) && __cplusplus >= 201103L
#define STBI_THREAD_LOCAL thread_local
#elif defined(__GNUC__) && __GNUC__ < 5
#define STBI_THREAD_LOCAL __thread
#elif defined(_MSC_VER)
#define STBI_THREAD_LOCAL __declspec(thread)
#elif defined (__STDC_VERSION__) && __STDC_VERSION__ >= 201112L && !defined(__STDC_NO_THREADS__)
#define STBI_THREAD_LOCAL _Thread_local

#ifndef STBI_THREAD_LOCAL
#if defined(__GNUC__)
#define STBI_THREAD_LOCAL __thread

#if defined(_MSC_VER) || defined(__SYMBIAN32__)
typedef unsigned short stbi__uint16;
typedef signed short stbi__int16;
typedef unsigned int stbi__uint32;
typedef signed int stbi__int32;

typedef uint16_t stbi__uint16;
typedef int16_t stbi__int16;
typedef uint32_t stbi__uint32;
typedef int32_t stbi__int32;

// should produce compiler error if size is wrong
typedef unsigned char validate_uint32[sizeof(stbi__uint32)==4 ? 1 : -1];

#define STBI_NOTUSED(v) (void)(v)

#define STBI_NOTUSED(v) (void)sizeof(v)

#define STBI_HAS_LROTL

#ifdef STBI_HAS_LROTL
#define stbi_lrot(x,y) _lrotl(x,y)

#define stbi_lrot(x,y) (((x) << (y)) | ((x) >> (-(y) & 31)))

#if defined(STBI_MALLOC) && defined(STBI_FREE) && (defined(STBI_REALLOC) || defined(STBI_REALLOC_SIZED))

#elif !defined(STBI_MALLOC) && !defined(STBI_FREE) && !defined(STBI_REALLOC) && !defined(STBI_REALLOC_SIZED)

#error "Must define all or none of STBI_MALLOC, STBI_FREE, and STBI_REALLOC (or STBI_REALLOC_SIZED)."

#define STBI_MALLOC(sz) malloc(sz)
#define STBI_REALLOC(p,newsz) realloc(p,newsz)
#define STBI_FREE(p) free(p)

#ifndef STBI_REALLOC_SIZED
#define STBI_REALLOC_SIZED(p,oldsz,newsz) STBI_REALLOC(p,newsz)

#if defined(__x86_64__) || defined(_M_X64)
#define STBI__X64_TARGET
#elif defined(__i386) || defined(_M_IX86)
#define STBI__X86_TARGET

#if defined(__GNUC__) && defined(STBI__X86_TARGET) && !defined(__SSE2__) && !defined(STBI_NO_SIMD)
// gcc doesn't support sse2 intrinsics unless you compile with -msse2,
// which in turn means it gets to use SSE2 everywhere. This is unfortunate,
// but previous attempts to provide the SSE2 functions with runtime
// detection caused numerous issues. The way architecture extensions are
// exposed in GCC/Clang is, sadly, not really suited for one-file libs.
// New behavior: if compiled with -msse2, we use SSE2 without any
// detection; if not, we don't use it at all.

#if defined(__MINGW32__) && defined(STBI__X86_TARGET) && !defined(STBI_MINGW_ENABLE_SSE2) && !defined(STBI_NO_SIMD)
// Note that __MINGW32__ doesn't actually mean 32-bit, so we have to avoid STBI__X64_TARGET
//
// 32-bit MinGW wants ESP to be 16-byte aligned, but this is not in the
// Windows ABI and VC++ as well as Windows DLLs don't maintain that invariant.
// As a result, enabling SSE2 on 32-bit MinGW is dangerous when not
// simultaneously enabling "-mstackrealign".
//
// See https://github.com/nothings/stb/issues/81 for more information.
//
// So default to no SSE2 on 32-bit MinGW. If you've read this far and added
// -mstackrealign to your build settings, feel free to #define STBI_MINGW_ENABLE_SSE2.

#if !defined(STBI_NO_SIMD) && (defined(STBI__X86_TARGET) || defined(STBI__X64_TARGET))

#include <emmintrin.h>

#if _MSC_VER >= 1400 // not VC6
#include <intrin.h> // __cpuid
static int stbi__cpuid3(void)

static int stbi__cpuid3(void)

#define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name

#if !defined(STBI_NO_JPEG) && defined(STBI_SSE2)
static int stbi__sse2_available(void)
   int info3 = stbi__cpuid3();
   return ((info3 >> 26) & 1) != 0;

#else // assume GCC-style if not VC++
#define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))

#if !defined(STBI_NO_JPEG) && defined(STBI_SSE2)
static int stbi__sse2_available(void)
   // If we're even attempting to compile this on GCC/Clang, that means
   // -msse2 is on, which means the compiler is allowed to use SSE2
   // instructions at will, and so are we.

#if defined(STBI_NO_SIMD) && defined(STBI_NEON)

#include <arm_neon.h>

#define STBI_SIMD_ALIGN(type, name) __declspec(align(16)) type name

#define STBI_SIMD_ALIGN(type, name) type name __attribute__((aligned(16)))

#ifndef STBI_SIMD_ALIGN
#define STBI_SIMD_ALIGN(type, name) type name

#ifndef STBI_MAX_DIMENSIONS
#define STBI_MAX_DIMENSIONS (1 << 24)

///////////////////////////////////////////////
//
// stbi__context struct and start_xxx functions

// stbi__context structure is our basic context used by all images, so it
// contains all the IO context, plus some basic image information

   stbi__uint32 img_x, img_y;
   int img_n, img_out_n;

   stbi_io_callbacks io;

   int read_from_callbacks;

   stbi_uc buffer_start[128];
   int callback_already_read;

   stbi_uc *img_buffer, *img_buffer_end;
   stbi_uc *img_buffer_original, *img_buffer_original_end;

static void stbi__refill_buffer(stbi__context *s);

// initialize a memory-decode context
static void stbi__start_mem(stbi__context *s, stbi_uc const *buffer, int len)
   s->read_from_callbacks = 0;
   s->callback_already_read = 0;
   s->img_buffer = s->img_buffer_original = (stbi_uc *) buffer;
   s->img_buffer_end = s->img_buffer_original_end = (stbi_uc *) buffer+len;

// initialize a callback-based context
static void stbi__start_callbacks(stbi__context *s, stbi_io_callbacks *c, void *user)
   s->io_user_data = user;
   s->buflen = sizeof(s->buffer_start);
   s->read_from_callbacks = 1;
   s->callback_already_read = 0;
   s->img_buffer = s->img_buffer_original = s->buffer_start;
   stbi__refill_buffer(s);
   s->img_buffer_original_end = s->img_buffer_end;

#ifndef STBI_NO_STDIO

static int stbi__stdio_read(void *user, char *data, int size)
   return (int) fread(data,1,size,(FILE*) user);

static void stbi__stdio_skip(void *user, int n)
   fseek((FILE*) user, n, SEEK_CUR);
   ch = fgetc((FILE*) user); /* have to read a byte to reset feof()'s flag */
      ungetc(ch, (FILE *) user); /* push byte back onto stream if valid. */

static int stbi__stdio_eof(void *user)
   return feof((FILE*) user) || ferror((FILE *) user);

static stbi_io_callbacks stbi__stdio_callbacks =

static void stbi__start_file(stbi__context *s, FILE *f)
   stbi__start_callbacks(s, &stbi__stdio_callbacks, (void *) f);

//static void stop_file(stbi__context *s) { }

#endif // !STBI_NO_STDIO

static void stbi__rewind(stbi__context *s)
   // conceptually rewind SHOULD rewind to the beginning of the stream,
   // but we just rewind to the beginning of the initial buffer, because
   // we only use it after doing 'test', which only ever looks at at most 92 bytes
   s->img_buffer = s->img_buffer_original;
   s->img_buffer_end = s->img_buffer_original_end;

   int bits_per_channel;

static int stbi__jpeg_test(stbi__context *s);
static void *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp);

static int stbi__png_test(stbi__context *s);
static void *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp);
static int stbi__png_is16(stbi__context *s);

static int stbi__bmp_test(stbi__context *s);
static void *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp);

static int stbi__tga_test(stbi__context *s);
static void *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp);

static int stbi__psd_test(stbi__context *s);
static void *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc);
static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp);
static int stbi__psd_is16(stbi__context *s);

static int stbi__hdr_test(stbi__context *s);
static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp);

static int stbi__pic_test(stbi__context *s);
static void *stbi__pic_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp);

static int stbi__gif_test(stbi__context *s);
static void *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
static void *stbi__load_gif_main(stbi__context *s, int **delays, int *x, int *y, int *z, int *comp, int req_comp);
static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp);

static int stbi__pnm_test(stbi__context *s);
static void *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri);
static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp);
static int stbi__pnm_is16(stbi__context *s);
965 #ifdef STBI_THREAD_LOCAL
968 const char *stbi__g_failure_reason;
970 STBIDEF const char *stbi_failure_reason(void)
972 return stbi__g_failure_reason;
975 #ifndef STBI_NO_FAILURE_STRINGS
976 static int stbi__err(const char *str)
978 stbi__g_failure_reason = str;
983 static void *stbi__malloc(size_t size)
985 return STBI_MALLOC(size);
988 // stb_image uses ints pervasively, including for offset calculations.
989 // therefore the largest decoded image size we can support with the
990 // current code, even on 64-bit targets, is INT_MAX. this is not a
991 // significant limitation for the intended use case.
993 // we do, however, need to make sure our size calculations don't
994 // overflow. hence a few helper functions for size calculations that
995 // multiply integers together, making sure that they're non-negative
996 // and no overflow occurs.
998 // return 1 if the sum is valid, 0 on overflow.
999 // negative terms are considered invalid.
1000 static int stbi__addsizes_valid(int a, int b)
1002 if (b < 0) return 0;
1003 // now 0 <= b <= INT_MAX, hence also
   // 0 <= INT_MAX - b <= INT_MAX.
1005 // And "a + b <= INT_MAX" (which might overflow) is the
1006 // same as a <= INT_MAX - b (no overflow)
1007 return a <= INT_MAX - b;
1010 // returns 1 if the product is valid, 0 on overflow.
1011 // negative factors are considered invalid.
1012 static int stbi__mul2sizes_valid(int a, int b)
1014 if (a < 0 || b < 0) return 0;
1015 if (b == 0) return 1; // mul-by-0 is always safe
1016 // portable way to check for no overflows in a*b
1017 return a <= INT_MAX/b;
1020 #if !defined(STBI_NO_JPEG) || !defined(STBI_NO_PNG) || !defined(STBI_NO_TGA) || !defined(STBI_NO_HDR)
1021 // returns 1 if "a*b + add" has no negative terms/factors and doesn't overflow
1022 static int stbi__mad2sizes_valid(int a, int b, int add)
1024 return stbi__mul2sizes_valid(a, b) && stbi__addsizes_valid(a*b, add);
1028 // returns 1 if "a*b*c + add" has no negative terms/factors and doesn't overflow
1029 static int stbi__mad3sizes_valid(int a, int b, int c, int add)
1031 return stbi__mul2sizes_valid(a, b) && stbi__mul2sizes_valid(a*b, c) &&
1032 stbi__addsizes_valid(a*b*c, add);
1035 // returns 1 if "a*b*c*d + add" has no negative terms/factors and doesn't overflow
1036 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR) || !defined(STBI_NO_PNM)
1037 static int stbi__mad4sizes_valid(int a, int b, int c, int d, int add)
1039 return stbi__mul2sizes_valid(a, b) && stbi__mul2sizes_valid(a*b, c) &&
1040 stbi__mul2sizes_valid(a*b*c, d) && stbi__addsizes_valid(a*b*c*d, add);
1044 #if !defined(STBI_NO_JPEG) || !defined(STBI_NO_PNG) || !defined(STBI_NO_TGA) || !defined(STBI_NO_HDR)
1045 // mallocs with size overflow checking
1046 static void *stbi__malloc_mad2(int a, int b, int add)
1048 if (!stbi__mad2sizes_valid(a, b, add)) return NULL;
1049 return stbi__malloc(a*b + add);
1053 static void *stbi__malloc_mad3(int a, int b, int c, int add)
1055 if (!stbi__mad3sizes_valid(a, b, c, add)) return NULL;
1056 return stbi__malloc(a*b*c + add);
1059 #if !defined(STBI_NO_LINEAR) || !defined(STBI_NO_HDR) || !defined(STBI_NO_PNM)
1060 static void *stbi__malloc_mad4(int a, int b, int c, int d, int add)
1062 if (!stbi__mad4sizes_valid(a, b, c, d, add)) return NULL;
1063 return stbi__malloc(a*b*c*d + add);
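/* Illustrative sketch, not part of stb_image: the checked-size helpers above
   can be tried in isolation. The names sketch_mul_ok, sketch_add_ok and
   sketch_alloc_pixels are hypothetical stand-ins that mirror the same idea
   (validate every partial product before multiplying, then allocate); only
   <limits.h> and <stdlib.h> are assumed.

```c
#include <limits.h>
#include <stdlib.h>

// 1 if a*b fits in an int and neither factor is negative, else 0.
int sketch_mul_ok(int a, int b)
{
   if (a < 0 || b < 0) return 0;
   if (b == 0) return 1;          // multiplying by 0 can never overflow
   return a <= INT_MAX / b;       // division cannot overflow, unlike a*b
}

// 1 if a+b fits in an int and b is non-negative, else 0.
int sketch_add_ok(int a, int b)
{
   if (b < 0) return 0;
   return a <= INT_MAX - b;
}

// Allocate w*h*comp + extra bytes, or return NULL if the size would overflow.
// Each product is only formed after the check for it has passed.
void *sketch_alloc_pixels(int w, int h, int comp, int extra)
{
   if (!sketch_mul_ok(w, h)) return NULL;
   if (!sketch_mul_ok(w * h, comp)) return NULL;
   if (!sketch_add_ok(w * h * comp, extra)) return NULL;
   return malloc((size_t) (w * h * comp + extra));
}
```

   A caller would treat NULL the same way for "too large" and "out of
   memory", which is also how the stbi__malloc_mad* helpers behave.
*/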
1067 // returns 1 if the sum of two signed ints is valid (between -2^31 and 2^31-1 inclusive), 0 on overflow.
1068 static int stbi__addints_valid(int a, int b)
1070 if ((a >= 0) != (b >= 0)) return 1; // a and b have different signs, so no overflow
1071 if (a < 0 && b < 0) return a >= INT_MIN - b; // same as a + b >= INT_MIN; INT_MIN - b cannot overflow since b < 0.
1072 return a <= INT_MAX - b;
1075 // returns 1 if the product of two signed shorts is valid, 0 on overflow.
1076 static int stbi__mul2shorts_valid(short a, short b)
   if (b == 0) return 1; // multiplication by 0 is always 0
   if (b == -1) return a >= -SHRT_MAX; // -a fits unless a is below -SHRT_MAX; handled here so SHRT_MIN/b doesn't overflow
   if ((a >= 0) == (b >= 0)) return b > 0 ? a <= SHRT_MAX/b : a >= SHRT_MAX/b; // product is non-negative; dividing by a negative b flips the comparison
1080 if (b < 0) return a <= SHRT_MIN / b; // same as a * b >= SHRT_MIN
1081 return a >= SHRT_MIN / b;
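/* Illustrative sketch, not part of stb_image: a simple way to cross-check the
   intended contract of a "does the product fit in a short" predicate is to
   widen first and compare afterwards. sketch_mul2shorts_ok is a hypothetical
   name; the sketch assumes short is 16 bits wide, so the product always fits
   in a long (which is at least 32 bits).

```c
#include <limits.h>

// 1 if a*b is representable as a short, else 0.
int sketch_mul2shorts_ok(short a, short b)
{
   long p = (long) a * (long) b;   // cannot overflow for 16-bit shorts
   return p >= SHRT_MIN && p <= SHRT_MAX;
}
```

   The division-based form avoids the wider type; the widened form is
   easier to reason about and is handy as a reference in tests.
*/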
1084 // stbi__err - error
1085 // stbi__errpf - error returning pointer to float
1086 // stbi__errpuc - error returning pointer to unsigned char
1088 #ifdef STBI_NO_FAILURE_STRINGS
1089 #define stbi__err(x,y) 0
1090 #elif defined(STBI_FAILURE_USERMSG)
1091 #define stbi__err(x,y) stbi__err(y)
1093 #define stbi__err(x,y) stbi__err(x)
1096 #define stbi__errpf(x,y) ((float *)(size_t) (stbi__err(x,y)?NULL:NULL))
1097 #define stbi__errpuc(x,y) ((unsigned char *)(size_t) (stbi__err(x,y)?NULL:NULL))
1099 STBIDEF void stbi_image_free(void *retval_from_stbi_load)
1101 STBI_FREE(retval_from_stbi_load);
1104 #ifndef STBI_NO_LINEAR
1105 static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp);
1109 static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp);
1112 static int stbi__vertically_flip_on_load_global = 0;
1114 STBIDEF void stbi_set_flip_vertically_on_load(int flag_true_if_should_flip)
1116 stbi__vertically_flip_on_load_global = flag_true_if_should_flip;
1119 #ifndef STBI_THREAD_LOCAL
1120 #define stbi__vertically_flip_on_load stbi__vertically_flip_on_load_global
1122 static STBI_THREAD_LOCAL int stbi__vertically_flip_on_load_local, stbi__vertically_flip_on_load_set;
1124 STBIDEF void stbi_set_flip_vertically_on_load_thread(int flag_true_if_should_flip)
1126 stbi__vertically_flip_on_load_local = flag_true_if_should_flip;
1127 stbi__vertically_flip_on_load_set = 1;
1130 #define stbi__vertically_flip_on_load (stbi__vertically_flip_on_load_set \
1131 ? stbi__vertically_flip_on_load_local \
1132 : stbi__vertically_flip_on_load_global)
1133 #endif // STBI_THREAD_LOCAL
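/* Illustrative sketch, not part of stb_image: the global-plus-thread-override
   pattern used for the flip flag, reduced to a standalone form. All sketch_*
   names are hypothetical, and C11 _Thread_local is assumed where the library
   itself uses the STBI_THREAD_LOCAL macro.

```c
// Process-wide default and a per-thread override.
static int sketch_flip_global = 0;
static _Thread_local int sketch_flip_local = 0;
static _Thread_local int sketch_flip_local_set = 0;

// Set the default seen by every thread that has not set its own value.
void sketch_set_flip(int flag) { sketch_flip_global = flag; }

// Set the value for the calling thread only.
void sketch_set_flip_thread(int flag)
{
   sketch_flip_local = flag;
   sketch_flip_local_set = 1;
}

// The calling thread's override wins once it has been set;
// otherwise the global default applies.
int sketch_should_flip(void)
{
   return sketch_flip_local_set ? sketch_flip_local : sketch_flip_global;
}
```

   Note that once a thread has set its own value, later changes to the
   global default no longer affect that thread.
*/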
1135 static void *stbi__load_main(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc)
1137 memset(ri, 0, sizeof(*ri)); // make sure it's initialized if we add new fields
1138 ri->bits_per_channel = 8; // default is 8 so most paths don't have to be changed
1139 ri->channel_order = STBI_ORDER_RGB; // all current input & output are this, but this is here so we can add BGR order
1140 ri->num_channels = 0;
   // test the formats with a very explicit header first (at least a FOURCC
   // or distinctive magic number)
1145 if (stbi__png_test(s)) return stbi__png_load(s,x,y,comp,req_comp, ri);
1148 if (stbi__bmp_test(s)) return stbi__bmp_load(s,x,y,comp,req_comp, ri);
1151 if (stbi__gif_test(s)) return stbi__gif_load(s,x,y,comp,req_comp, ri);
1154 if (stbi__psd_test(s)) return stbi__psd_load(s,x,y,comp,req_comp, ri, bpc);
1159 if (stbi__pic_test(s)) return stbi__pic_load(s,x,y,comp,req_comp, ri);
   // then the formats that can end up attempting to load with just 1 or 2
   // bytes matching expectations; these are prone to false positives, so
   // try them later
1165 #ifndef STBI_NO_JPEG
1166 if (stbi__jpeg_test(s)) return stbi__jpeg_load(s,x,y,comp,req_comp, ri);
1169 if (stbi__pnm_test(s)) return stbi__pnm_load(s,x,y,comp,req_comp, ri);
1173 if (stbi__hdr_test(s)) {
1174 float *hdr = stbi__hdr_load(s, x,y,comp,req_comp, ri);
1175 return stbi__hdr_to_ldr(hdr, *x, *y, req_comp ? req_comp : *comp);
1180 // test tga last because it's a crappy test!
1181 if (stbi__tga_test(s))
1182 return stbi__tga_load(s,x,y,comp,req_comp, ri);
1185 return stbi__errpuc("unknown image type", "Image not of any known type, or corrupt");
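/* Illustrative sketch, not part of stb_image: the ordering idea in
   stbi__load_main (distinctive signatures first, weak ones last) shown on a
   memory buffer. sketch_format and sketch_sniff are hypothetical; the real
   stbi__*_test functions inspect more than these leading bytes, and the
   library tests BMP earlier than this sketch does because its BMP test also
   checks the header fields.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
   SKETCH_UNKNOWN, SKETCH_PNG, SKETCH_BMP, SKETCH_GIF, SKETCH_PSD, SKETCH_JPEG
} sketch_format;

sketch_format sketch_sniff(const unsigned char *data, size_t len)
{
   static const unsigned char png_sig[8] =
      { 0x89, 'P', 'N', 'G', 0x0d, 0x0a, 0x1a, 0x0a };

   // Long, distinctive signatures first.
   if (len >= 8 && memcmp(data, png_sig, 8) == 0) return SKETCH_PNG;
   if (len >= 6 && (memcmp(data, "GIF87a", 6) == 0 ||
                    memcmp(data, "GIF89a", 6) == 0)) return SKETCH_GIF;
   if (len >= 4 && memcmp(data, "8BPS", 4) == 0) return SKETCH_PSD;

   // Two-byte signatures last: they match by accident far more easily.
   if (len >= 2 && data[0] == 'B' && data[1] == 'M') return SKETCH_BMP;
   if (len >= 2 && data[0] == 0xff && data[1] == 0xd8) return SKETCH_JPEG;

   return SKETCH_UNKNOWN;
}
```
*/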
1188 static stbi_uc *stbi__convert_16_to_8(stbi__uint16 *orig, int w, int h, int channels)
1191 int img_len = w * h * channels;
1194 reduced = (stbi_uc *) stbi__malloc(img_len);
1195 if (reduced == NULL) return stbi__errpuc("outofmem", "Out of memory");
1197 for (i = 0; i < img_len; ++i)
      reduced[i] = (stbi_uc)((orig[i] >> 8) & 0xFF); // the high byte of each 16-bit value is a sufficient approximation of 16->8 bit scaling
1204 static stbi__uint16 *stbi__convert_8_to_16(stbi_uc *orig, int w, int h, int channels)
1207 int img_len = w * h * channels;
1208 stbi__uint16 *enlarged;
1210 enlarged = (stbi__uint16 *) stbi__malloc(img_len*2);
1211 if (enlarged == NULL) return (stbi__uint16 *) stbi__errpuc("outofmem", "Out of memory");
1213 for (i = 0; i < img_len; ++i)
1214 enlarged[i] = (stbi__uint16)((orig[i] << 8) + orig[i]); // replicate to high and low byte, maps 0->0, 255->0xffff
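/* Illustrative sketch, not part of stb_image: the two per-sample conversions
   used above, written as standalone functions. sketch_narrow16 and
   sketch_widen8 are hypothetical names; <stdint.h> fixed-width types are
   assumed in place of stbi_uc and stbi__uint16.

```c
#include <stdint.h>

// 16 -> 8 bits: keep the high byte.
uint8_t sketch_narrow16(uint16_t v) { return (uint8_t) (v >> 8); }

// 8 -> 16 bits: replicate the byte into both halves (the same as v * 257),
// so 0 maps to 0 and 255 maps to 0xffff.
uint16_t sketch_widen8(uint8_t v) { return (uint16_t) ((v << 8) | v); }
```

   Widening and then narrowing returns the original 8-bit value, which is
   why byte replication is preferred over a plain shift by 8.
*/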
1220 static void stbi__vertical_flip(void *image, int w, int h, int bytes_per_pixel)
1223 size_t bytes_per_row = (size_t)w * bytes_per_pixel;
1225 stbi_uc *bytes = (stbi_uc *)image;
1227 for (row = 0; row < (h>>1); row++) {
1228 stbi_uc *row0 = bytes + row*bytes_per_row;
1229 stbi_uc *row1 = bytes + (h - row - 1)*bytes_per_row;
1230 // swap row0 with row1
1231 size_t bytes_left = bytes_per_row;
1232 while (bytes_left) {
1233 size_t bytes_copy = (bytes_left < sizeof(temp)) ? bytes_left : sizeof(temp);
1234 memcpy(temp, row0, bytes_copy);
1235 memcpy(row0, row1, bytes_copy);
1236 memcpy(row1, temp, bytes_copy);
1239 bytes_left -= bytes_copy;
1245 static void stbi__vertical_flip_slices(void *image, int w, int h, int z, int bytes_per_pixel)
1248 int slice_size = w * h * bytes_per_pixel;
1250 stbi_uc *bytes = (stbi_uc *)image;
1251 for (slice = 0; slice < z; ++slice) {
1252 stbi__vertical_flip(bytes, w, h, bytes_per_pixel);
1253 bytes += slice_size;
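/* Illustrative sketch, not part of stb_image: an in-place vertical flip in
   the same style as the routines above, swapping rows through a small
   scratch buffer. sketch_flip_rows is a hypothetical name, the image is
   assumed to be tightly packed, and the 64-byte scratch size is an
   arbitrary choice for the example.

```c
#include <stddef.h>
#include <string.h>

// Reverse the row order of a tightly packed image in place.
void sketch_flip_rows(unsigned char *pixels, int w, int h, int bytes_per_pixel)
{
   unsigned char temp[64];   // rows are swapped in chunks of this size
   size_t row_bytes = (size_t) w * (size_t) bytes_per_pixel;
   int row;

   for (row = 0; row < h / 2; ++row) {
      unsigned char *top = pixels + (size_t) row * row_bytes;
      unsigned char *bottom = pixels + (size_t) (h - row - 1) * row_bytes;
      size_t left = row_bytes;
      while (left > 0) {
         size_t n = left < sizeof(temp) ? left : sizeof(temp);
         memcpy(temp, top, n);
         memcpy(top, bottom, n);
         memcpy(bottom, temp, n);
         top += n;
         bottom += n;
         left -= n;
      }
   }
}
```

   With an odd number of rows the middle row stays where it is, since the
   loop only runs over the first h/2 rows.
*/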
1258 static unsigned char *stbi__load_and_postprocess_8bit(stbi__context *s, int *x, int *y, int *comp, int req_comp)
1260 stbi__result_info ri;
1261 void *result = stbi__load_main(s, x, y, comp, req_comp, &ri, 8);
1266 // it is the responsibility of the loaders to make sure we get either 8 or 16 bit.
1267 STBI_ASSERT(ri.bits_per_channel == 8 || ri.bits_per_channel == 16);
1269 if (ri.bits_per_channel != 8) {
1270 result = stbi__convert_16_to_8((stbi__uint16 *) result, *x, *y, req_comp == 0 ? *comp : req_comp);
1271 ri.bits_per_channel = 8;
1274 // @TODO: move stbi__convert_format to here
1276 if (stbi__vertically_flip_on_load) {
1277 int channels = req_comp ? req_comp : *comp;
1278 stbi__vertical_flip(result, *x, *y, channels * sizeof(stbi_uc));
1281 return (unsigned char *) result;
1284 static stbi__uint16 *stbi__load_and_postprocess_16bit(stbi__context *s, int *x, int *y, int *comp, int req_comp)
1286 stbi__result_info ri;
1287 void *result = stbi__load_main(s, x, y, comp, req_comp, &ri, 16);
1292 // it is the responsibility of the loaders to make sure we get either 8 or 16 bit.
1293 STBI_ASSERT(ri.bits_per_channel == 8 || ri.bits_per_channel == 16);
1295 if (ri.bits_per_channel != 16) {
1296 result = stbi__convert_8_to_16((stbi_uc *) result, *x, *y, req_comp == 0 ? *comp : req_comp);
1297 ri.bits_per_channel = 16;
1300 // @TODO: move stbi__convert_format16 to here
1301 // @TODO: special case RGB-to-Y (and RGBA-to-YA) for 8-bit-to-16-bit case to keep more precision
1303 if (stbi__vertically_flip_on_load) {
1304 int channels = req_comp ? req_comp : *comp;
1305 stbi__vertical_flip(result, *x, *y, channels * sizeof(stbi__uint16));
1308 return (stbi__uint16 *) result;
1311 #if !defined(STBI_NO_HDR) && !defined(STBI_NO_LINEAR)
1312 static void stbi__float_postprocess(float *result, int *x, int *y, int *comp, int req_comp)
1314 if (stbi__vertically_flip_on_load && result != NULL) {
1315 int channels = req_comp ? req_comp : *comp;
1316 stbi__vertical_flip(result, *x, *y, channels * sizeof(float));
1321 #ifndef STBI_NO_STDIO
1323 #if defined(_WIN32) && defined(STBI_WINDOWS_UTF8)
1324 STBI_EXTERN __declspec(dllimport) int __stdcall MultiByteToWideChar(unsigned int cp, unsigned long flags, const char *str, int cbmb, wchar_t *widestr, int cchwide);
1325 STBI_EXTERN __declspec(dllimport) int __stdcall WideCharToMultiByte(unsigned int cp, unsigned long flags, const wchar_t *widestr, int cchwide, char *str, int cbmb, const char *defchar, int *used_default);
1328 #if defined(_WIN32) && defined(STBI_WINDOWS_UTF8)
1329 STBIDEF int stbi_convert_wchar_to_utf8(char *buffer, size_t bufferlen, const wchar_t* input)
1331 return WideCharToMultiByte(65001 /* UTF8 */, 0, input, -1, buffer, (int) bufferlen, NULL, NULL);
1335 static FILE *stbi__fopen(char const *filename, char const *mode)
1338 #if defined(_WIN32) && defined(STBI_WINDOWS_UTF8)
1340 wchar_t wFilename[1024];
1341 if (0 == MultiByteToWideChar(65001 /* UTF8 */, 0, filename, -1, wFilename, sizeof(wFilename)/sizeof(*wFilename)))
1344 if (0 == MultiByteToWideChar(65001 /* UTF8 */, 0, mode, -1, wMode, sizeof(wMode)/sizeof(*wMode)))
1347 #if defined(_MSC_VER) && _MSC_VER >= 1400
1348 if (0 != _wfopen_s(&f, wFilename, wMode))
1351 f = _wfopen(wFilename, wMode);
1354 #elif defined(_MSC_VER) && _MSC_VER >= 1400
1355 if (0 != fopen_s(&f, filename, mode))
1358 f = fopen(filename, mode);
1364 STBIDEF stbi_uc *stbi_load(char const *filename, int *x, int *y, int *comp, int req_comp)
1366 FILE *f = stbi__fopen(filename, "rb");
1367 unsigned char *result;
1368 if (!f) return stbi__errpuc("can't fopen", "Unable to open file");
1369 result = stbi_load_from_file(f,x,y,comp,req_comp);
1374 STBIDEF stbi_uc *stbi_load_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1376 unsigned char *result;
1378 stbi__start_file(&s,f);
1379 result = stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp);
1381 // need to 'unget' all the characters in the IO buffer
1382 fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
1387 STBIDEF stbi__uint16 *stbi_load_from_file_16(FILE *f, int *x, int *y, int *comp, int req_comp)
1389 stbi__uint16 *result;
1391 stbi__start_file(&s,f);
1392 result = stbi__load_and_postprocess_16bit(&s,x,y,comp,req_comp);
1394 // need to 'unget' all the characters in the IO buffer
1395 fseek(f, - (int) (s.img_buffer_end - s.img_buffer), SEEK_CUR);
1400 STBIDEF stbi_us *stbi_load_16(char const *filename, int *x, int *y, int *comp, int req_comp)
1402 FILE *f = stbi__fopen(filename, "rb");
1403 stbi__uint16 *result;
1404 if (!f) return (stbi_us *) stbi__errpuc("can't fopen", "Unable to open file");
1405 result = stbi_load_from_file_16(f,x,y,comp,req_comp);
1411 #endif //!STBI_NO_STDIO
1413 STBIDEF stbi_us *stbi_load_16_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *channels_in_file, int desired_channels)
1416 stbi__start_mem(&s,buffer,len);
1417 return stbi__load_and_postprocess_16bit(&s,x,y,channels_in_file,desired_channels);
1420 STBIDEF stbi_us *stbi_load_16_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *channels_in_file, int desired_channels)
1423 stbi__start_callbacks(&s, (stbi_io_callbacks *)clbk, user);
1424 return stbi__load_and_postprocess_16bit(&s,x,y,channels_in_file,desired_channels);
1427 STBIDEF stbi_uc *stbi_load_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1430 stbi__start_mem(&s,buffer,len);
1431 return stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp);
1434 STBIDEF stbi_uc *stbi_load_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1437 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1438 return stbi__load_and_postprocess_8bit(&s,x,y,comp,req_comp);
1442 STBIDEF stbi_uc *stbi_load_gif_from_memory(stbi_uc const *buffer, int len, int **delays, int *x, int *y, int *z, int *comp, int req_comp)
1444 unsigned char *result;
1446 stbi__start_mem(&s,buffer,len);
1448 result = (unsigned char*) stbi__load_gif_main(&s, delays, x, y, z, comp, req_comp);
1449 if (stbi__vertically_flip_on_load) {
1450 stbi__vertical_flip_slices( result, *x, *y, *z, *comp );
1457 #ifndef STBI_NO_LINEAR
1458 static float *stbi__loadf_main(stbi__context *s, int *x, int *y, int *comp, int req_comp)
1460 unsigned char *data;
1462 if (stbi__hdr_test(s)) {
1463 stbi__result_info ri;
1464 float *hdr_data = stbi__hdr_load(s,x,y,comp,req_comp, &ri);
1466 stbi__float_postprocess(hdr_data,x,y,comp,req_comp);
1470 data = stbi__load_and_postprocess_8bit(s, x, y, comp, req_comp);
1472 return stbi__ldr_to_hdr(data, *x, *y, req_comp ? req_comp : *comp);
1473 return stbi__errpf("unknown image type", "Image not of any known type, or corrupt");
1476 STBIDEF float *stbi_loadf_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp, int req_comp)
1479 stbi__start_mem(&s,buffer,len);
1480 return stbi__loadf_main(&s,x,y,comp,req_comp);
1483 STBIDEF float *stbi_loadf_from_callbacks(stbi_io_callbacks const *clbk, void *user, int *x, int *y, int *comp, int req_comp)
1486 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1487 return stbi__loadf_main(&s,x,y,comp,req_comp);
1490 #ifndef STBI_NO_STDIO
1491 STBIDEF float *stbi_loadf(char const *filename, int *x, int *y, int *comp, int req_comp)
1494 FILE *f = stbi__fopen(filename, "rb");
1495 if (!f) return stbi__errpf("can't fopen", "Unable to open file");
1496 result = stbi_loadf_from_file(f,x,y,comp,req_comp);
1501 STBIDEF float *stbi_loadf_from_file(FILE *f, int *x, int *y, int *comp, int req_comp)
1504 stbi__start_file(&s,f);
1505 return stbi__loadf_main(&s,x,y,comp,req_comp);
1507 #endif // !STBI_NO_STDIO
1509 #endif // !STBI_NO_LINEAR
// these is-hdr-or-not functions are defined independent of whether STBI_NO_LINEAR is
// defined, for API simplicity; if STBI_NO_LINEAR is defined, they always
// report false!
1515 STBIDEF int stbi_is_hdr_from_memory(stbi_uc const *buffer, int len)
1519 stbi__start_mem(&s,buffer,len);
1520 return stbi__hdr_test(&s);
1522 STBI_NOTUSED(buffer);
1528 #ifndef STBI_NO_STDIO
1529 STBIDEF int stbi_is_hdr (char const *filename)
1531 FILE *f = stbi__fopen(filename, "rb");
1534 result = stbi_is_hdr_from_file(f);
1540 STBIDEF int stbi_is_hdr_from_file(FILE *f)
1543 long pos = ftell(f);
1546 stbi__start_file(&s,f);
1547 res = stbi__hdr_test(&s);
1548 fseek(f, pos, SEEK_SET);
1555 #endif // !STBI_NO_STDIO
1557 STBIDEF int stbi_is_hdr_from_callbacks(stbi_io_callbacks const *clbk, void *user)
1561 stbi__start_callbacks(&s, (stbi_io_callbacks *) clbk, user);
1562 return stbi__hdr_test(&s);
1570 #ifndef STBI_NO_LINEAR
1571 static float stbi__l2h_gamma=2.2f, stbi__l2h_scale=1.0f;
1573 STBIDEF void stbi_ldr_to_hdr_gamma(float gamma) { stbi__l2h_gamma = gamma; }
1574 STBIDEF void stbi_ldr_to_hdr_scale(float scale) { stbi__l2h_scale = scale; }
1577 static float stbi__h2l_gamma_i=1.0f/2.2f, stbi__h2l_scale_i=1.0f;
1579 STBIDEF void stbi_hdr_to_ldr_gamma(float gamma) { stbi__h2l_gamma_i = 1/gamma; }
1580 STBIDEF void stbi_hdr_to_ldr_scale(float scale) { stbi__h2l_scale_i = 1/scale; }
1583 //////////////////////////////////////////////////////////////////////////////
1585 // Common code used by all image loaders
1595 static void stbi__refill_buffer(stbi__context *s)
1597 int n = (s->io.read)(s->io_user_data,(char*)s->buffer_start,s->buflen);
1598 s->callback_already_read += (int) (s->img_buffer - s->img_buffer_original);
1600 // at end of file, treat same as if from memory, but need to handle case
1601 // where s->img_buffer isn't pointing to safe memory, e.g. 0-byte file
1602 s->read_from_callbacks = 0;
1603 s->img_buffer = s->buffer_start;
1604 s->img_buffer_end = s->buffer_start+1;
1607 s->img_buffer = s->buffer_start;
1608 s->img_buffer_end = s->buffer_start + n;
1612 stbi_inline static stbi_uc stbi__get8(stbi__context *s)
1614 if (s->img_buffer < s->img_buffer_end)
1615 return *s->img_buffer++;
1616 if (s->read_from_callbacks) {
1617 stbi__refill_buffer(s);
1618 return *s->img_buffer++;
1623 #if defined(STBI_NO_JPEG) && defined(STBI_NO_HDR) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM)
1626 stbi_inline static int stbi__at_eof(stbi__context *s)
1629 if (!(s->io.eof)(s->io_user_data)) return 0;
1630 // if feof() is true, check if buffer = end
1631 // special case: we've only got the special 0 character at the end
1632 if (s->read_from_callbacks == 0) return 1;
1635 return s->img_buffer >= s->img_buffer_end;
1639 #if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC)
1642 static void stbi__skip(stbi__context *s, int n)
1644 if (n == 0) return; // already there!
1646 s->img_buffer = s->img_buffer_end;
1650 int blen = (int) (s->img_buffer_end - s->img_buffer);
1652 s->img_buffer = s->img_buffer_end;
1653 (s->io.skip)(s->io_user_data, n - blen);
1661 #if defined(STBI_NO_PNG) && defined(STBI_NO_TGA) && defined(STBI_NO_HDR) && defined(STBI_NO_PNM)
1664 static int stbi__getn(stbi__context *s, stbi_uc *buffer, int n)
1667 int blen = (int) (s->img_buffer_end - s->img_buffer);
1671 memcpy(buffer, s->img_buffer, blen);
1673 count = (s->io.read)(s->io_user_data, (char*) buffer + blen, n - blen);
1674 res = (count == (n-blen));
1675 s->img_buffer = s->img_buffer_end;
1680 if (s->img_buffer+n <= s->img_buffer_end) {
1681 memcpy(buffer, s->img_buffer, n);
1689 #if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_PSD) && defined(STBI_NO_PIC)
1692 static int stbi__get16be(stbi__context *s)
1694 int z = stbi__get8(s);
1695 return (z << 8) + stbi__get8(s);
1699 #if defined(STBI_NO_PNG) && defined(STBI_NO_PSD) && defined(STBI_NO_PIC)
1702 static stbi__uint32 stbi__get32be(stbi__context *s)
1704 stbi__uint32 z = stbi__get16be(s);
1705 return (z << 16) + stbi__get16be(s);
1709 #if defined(STBI_NO_BMP) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF)
1712 static int stbi__get16le(stbi__context *s)
1714 int z = stbi__get8(s);
1715 return z + (stbi__get8(s) << 8);
1720 static stbi__uint32 stbi__get32le(stbi__context *s)
1722 stbi__uint32 z = stbi__get16le(s);
1723 z += (stbi__uint32)stbi__get16le(s) << 16;
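/* Illustrative sketch, not part of stb_image: the big- and little-endian
   readers above, expressed over a plain byte pointer instead of a
   stbi__context. The sketch_read* names are hypothetical and the caller is
   assumed to guarantee that enough bytes are available.

```c
#include <stdint.h>

// 16-bit reads: the first byte is the high byte (be) or the low byte (le).
uint16_t sketch_read16be(const unsigned char *p)
{
   return (uint16_t) ((p[0] << 8) | p[1]);
}

uint16_t sketch_read16le(const unsigned char *p)
{
   return (uint16_t) (p[0] | (p[1] << 8));
}

// 32-bit reads are built from two 16-bit reads, as in the code above.
uint32_t sketch_read32be(const unsigned char *p)
{
   return ((uint32_t) sketch_read16be(p) << 16) | sketch_read16be(p + 2);
}

uint32_t sketch_read32le(const unsigned char *p)
{
   return (uint32_t) sketch_read16le(p) | ((uint32_t) sketch_read16le(p + 2) << 16);
}
```

   Assembling values byte by byte keeps the result independent of the
   host's own byte order.
*/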
1728 #define STBI__BYTECAST(x) ((stbi_uc) ((x) & 255)) // truncate int to byte without warnings
1730 #if defined(STBI_NO_JPEG) && defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM)
1733 //////////////////////////////////////////////////////////////////////////////
1735 // generic converter from built-in img_n to req_comp
1736 // individual types do this automatically as much as possible (e.g. jpeg
1737 // does all cases internally since it needs to colorspace convert anyway,
// and it never has alpha, so very few cases). png can automatically
1739 // interleave an alpha=255 channel, but falls back to this for other cases
1741 // assume data buffer is malloced, so malloc a new one and free that one
1742 // only failure mode is malloc failing
1744 static stbi_uc stbi__compute_y(int r, int g, int b)
1746 return (stbi_uc) (((r*77) + (g*150) + (29*b)) >> 8);
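/* Illustrative sketch, not part of stb_image: the fixed-point luma formula
   above in standalone form. sketch_luma is a hypothetical name. The weights
   77, 150 and 29 sum to 256, so the shift by 8 renormalizes the result; they
   approximate the common 0.299 / 0.587 / 0.114 luma weights.

```c
// Integer luma for 8-bit r, g, b inputs in the range 0..255.
unsigned char sketch_luma(int r, int g, int b)
{
   return (unsigned char) ((r * 77 + g * 150 + b * 29) >> 8);
}
```

   Because the weights sum to exactly 256, a gray input (r == g == b) maps
   to the same gray value.
*/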
1750 #if defined(STBI_NO_PNG) && defined(STBI_NO_BMP) && defined(STBI_NO_PSD) && defined(STBI_NO_TGA) && defined(STBI_NO_GIF) && defined(STBI_NO_PIC) && defined(STBI_NO_PNM)
1753 static unsigned char *stbi__convert_format(unsigned char *data, int img_n, int req_comp, unsigned int x, unsigned int y)
1756 unsigned char *good;
1758 if (req_comp == img_n) return data;
1759 STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
1761 good = (unsigned char *) stbi__malloc_mad3(req_comp, x, y, 0);
1764 return stbi__errpuc("outofmem", "Out of memory");
1767 for (j=0; j < (int) y; ++j) {
1768 unsigned char *src = data + j * x * img_n ;
1769 unsigned char *dest = good + j * x * req_comp;
1771 #define STBI__COMBO(a,b) ((a)*8+(b))
1772 #define STBI__CASE(a,b) case STBI__COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
1773 // convert source image with img_n components to one with req_comp components;
1774 // avoid switch per pixel, so use switch per scanline and massive macros
1775 switch (STBI__COMBO(img_n, req_comp)) {
1776 STBI__CASE(1,2) { dest[0]=src[0]; dest[1]=255; } break;
1777 STBI__CASE(1,3) { dest[0]=dest[1]=dest[2]=src[0]; } break;
1778 STBI__CASE(1,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=255; } break;
1779 STBI__CASE(2,1) { dest[0]=src[0]; } break;
1780 STBI__CASE(2,3) { dest[0]=dest[1]=dest[2]=src[0]; } break;
1781 STBI__CASE(2,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=src[1]; } break;
1782 STBI__CASE(3,4) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];dest[3]=255; } break;
1783 STBI__CASE(3,1) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); } break;
1784 STBI__CASE(3,2) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); dest[1] = 255; } break;
1785 STBI__CASE(4,1) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); } break;
1786 STBI__CASE(4,2) { dest[0]=stbi__compute_y(src[0],src[1],src[2]); dest[1] = src[3]; } break;
1787 STBI__CASE(4,3) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2]; } break;
1788 default: STBI_ASSERT(0); STBI_FREE(data); STBI_FREE(good); return stbi__errpuc("unsupported", "Unsupported format conversion");
1798 #if defined(STBI_NO_PNG) && defined(STBI_NO_PSD)
1801 static stbi__uint16 stbi__compute_y_16(int r, int g, int b)
1803 return (stbi__uint16) (((r*77) + (g*150) + (29*b)) >> 8);
1807 #if defined(STBI_NO_PNG) && defined(STBI_NO_PSD)
1810 static stbi__uint16 *stbi__convert_format16(stbi__uint16 *data, int img_n, int req_comp, unsigned int x, unsigned int y)
1815 if (req_comp == img_n) return data;
1816 STBI_ASSERT(req_comp >= 1 && req_comp <= 4);
1818 good = (stbi__uint16 *) stbi__malloc(req_comp * x * y * 2);
1821 return (stbi__uint16 *) stbi__errpuc("outofmem", "Out of memory");
1824 for (j=0; j < (int) y; ++j) {
1825 stbi__uint16 *src = data + j * x * img_n ;
1826 stbi__uint16 *dest = good + j * x * req_comp;
1828 #define STBI__COMBO(a,b) ((a)*8+(b))
1829 #define STBI__CASE(a,b) case STBI__COMBO(a,b): for(i=x-1; i >= 0; --i, src += a, dest += b)
1830 // convert source image with img_n components to one with req_comp components;
1831 // avoid switch per pixel, so use switch per scanline and massive macros
1832 switch (STBI__COMBO(img_n, req_comp)) {
1833 STBI__CASE(1,2) { dest[0]=src[0]; dest[1]=0xffff; } break;
1834 STBI__CASE(1,3) { dest[0]=dest[1]=dest[2]=src[0]; } break;
1835 STBI__CASE(1,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=0xffff; } break;
1836 STBI__CASE(2,1) { dest[0]=src[0]; } break;
1837 STBI__CASE(2,3) { dest[0]=dest[1]=dest[2]=src[0]; } break;
1838 STBI__CASE(2,4) { dest[0]=dest[1]=dest[2]=src[0]; dest[3]=src[1]; } break;
1839 STBI__CASE(3,4) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2];dest[3]=0xffff; } break;
1840 STBI__CASE(3,1) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); } break;
1841 STBI__CASE(3,2) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); dest[1] = 0xffff; } break;
1842 STBI__CASE(4,1) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); } break;
1843 STBI__CASE(4,2) { dest[0]=stbi__compute_y_16(src[0],src[1],src[2]); dest[1] = src[3]; } break;
1844 STBI__CASE(4,3) { dest[0]=src[0];dest[1]=src[1];dest[2]=src[2]; } break;
1845 default: STBI_ASSERT(0); STBI_FREE(data); STBI_FREE(good); return (stbi__uint16*) stbi__errpuc("unsupported", "Unsupported format conversion");
1855 #ifndef STBI_NO_LINEAR
1856 static float *stbi__ldr_to_hdr(stbi_uc *data, int x, int y, int comp)
1860 if (!data) return NULL;
1861 output = (float *) stbi__malloc_mad4(x, y, comp, sizeof(float), 0);
1862 if (output == NULL) { STBI_FREE(data); return stbi__errpf("outofmem", "Out of memory"); }
1863 // compute number of non-alpha components
1864 if (comp & 1) n = comp; else n = comp-1;
1865 for (i=0; i < x*y; ++i) {
1866 for (k=0; k < n; ++k) {
1867 output[i*comp + k] = (float) (pow(data[i*comp+k]/255.0f, stbi__l2h_gamma) * stbi__l2h_scale);
1871 for (i=0; i < x*y; ++i) {
1872 output[i*comp + n] = data[i*comp + n]/255.0f;
1881 #define stbi__float2int(x) ((int) (x))
1882 static stbi_uc *stbi__hdr_to_ldr(float *data, int x, int y, int comp)
1886 if (!data) return NULL;
1887 output = (stbi_uc *) stbi__malloc_mad3(x, y, comp, 0);
1888 if (output == NULL) { STBI_FREE(data); return stbi__errpuc("outofmem", "Out of memory"); }
1889 // compute number of non-alpha components
1890 if (comp & 1) n = comp; else n = comp-1;
1891 for (i=0; i < x*y; ++i) {
1892 for (k=0; k < n; ++k) {
1893 float z = (float) pow(data[i*comp+k]*stbi__h2l_scale_i, stbi__h2l_gamma_i) * 255 + 0.5f;
1895 if (z > 255) z = 255;
1896 output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1899 float z = data[i*comp+k] * 255 + 0.5f;
1901 if (z > 255) z = 255;
1902 output[i*comp + k] = (stbi_uc) stbi__float2int(z);
1910 //////////////////////////////////////////////////////////////////////////////
1912 // "baseline" JPEG/JFIF decoder
1914 // simple implementation
1915 // - doesn't support delayed output of y-dimension
1916 // - simple interface (only one output format: 8-bit interleaved RGB)
1917 // - doesn't try to recover corrupt jpegs
1918 // - doesn't allow partial loading, loading multiple at once
1919 // - still fast on x86 (copying globals into locals doesn't help x86)
1920 // - allocates lots of intermediate memory (full size of all components)
1921 // - non-interleaved case requires this anyway
1922 // - allows good upsampling (see next)
1924 // - upsampled channels are bilinearly interpolated, even across blocks
1925 // - quality integer IDCT derived from IJG's 'slow'
1927 // - fast huffman; reasonable integer IDCT
1928 // - some SIMD kernels for common paths on targets with SSE2/NEON
1929 // - uses a lot of intermediate memory, could cache poorly
1931 #ifndef STBI_NO_JPEG
1933 // huffman decoding acceleration
1934 #define FAST_BITS 9 // larger handles more cases; smaller stomps less cache
1938 stbi_uc fast[1 << FAST_BITS];
1939 // weirdly, repacking this into AoS is a 10% speed loss, instead of a win
1940 stbi__uint16 code[256];
1941 stbi_uc values[256];
1943 unsigned int maxcode[18];
1944 int delta[17]; // old 'firstsymbol' - old 'firstcode'
1950 stbi__huffman huff_dc[4];
1951 stbi__huffman huff_ac[4];
1952 stbi__uint16 dequant[4][64];
1953 stbi__int16 fast_ac[4][1 << FAST_BITS];
1955 // sizes for components, interleaved MCUs
1956 int img_h_max, img_v_max;
1957 int img_mcu_x, img_mcu_y;
1958 int img_mcu_w, img_mcu_h;
1960 // definition of jpeg image component
1971 void *raw_data, *raw_coeff;
1973 short *coeff; // progressive only
1974 int coeff_w, coeff_h; // number of 8x8 coefficient blocks
1977 stbi__uint32 code_buffer; // jpeg entropy-coded buffer
1978 int code_bits; // number of valid bits
1979 unsigned char marker; // marker seen while filling entropy buffer
1980 int nomore; // flag if we saw a marker so must stop
1989 int app14_color_transform; // Adobe APP14 tag
1992 int scan_n, order[4];
1993 int restart_interval, todo;
1996 void (*idct_block_kernel)(stbi_uc *out, int out_stride, short data[64]);
1997 void (*YCbCr_to_RGB_kernel)(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step);
1998 stbi_uc *(*resample_row_hv_2_kernel)(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs);
2001 static int stbi__build_huffman(stbi__huffman *h, int *count)
2005 // build size list for each symbol (from JPEG spec)
2006 for (i=0; i < 16; ++i) {
2007 for (j=0; j < count[i]; ++j) {
2008 h->size[k++] = (stbi_uc) (i+1);
2009 if(k >= 257) return stbi__err("bad size list","Corrupt JPEG");
2014 // compute actual symbols (from jpeg spec)
2017 for(j=1; j <= 16; ++j) {
2018 // compute delta to add to code to compute symbol id
2019 h->delta[j] = k - code;
2020 if (h->size[k] == j) {
2021 while (h->size[k] == j)
2022 h->code[k++] = (stbi__uint16) (code++);
2023 if (code-1 >= (1u << j)) return stbi__err("bad code lengths","Corrupt JPEG");
2025 // compute largest code + 1 for this size, preshifted as needed later
2026 h->maxcode[j] = code << (16-j);
2029 h->maxcode[j] = 0xffffffff;
2031 // build non-spec acceleration table; 255 is flag for not-accelerated
2032 memset(h->fast, 255, 1 << FAST_BITS);
2033 for (i=0; i < k; ++i) {
2035 if (s <= FAST_BITS) {
2036 int c = h->code[i] << (FAST_BITS-s);
2037 int m = 1 << (FAST_BITS-s);
2038 for (j=0; j < m; ++j) {
2039 h->fast[c+j] = (stbi_uc) i;
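/* Illustrative sketch, not part of stb_image: canonical code assignment from
   a per-length count list, which is the first half of what
   stbi__build_huffman does. sketch_assign_codes is a hypothetical name and
   the error handling is simplified to a -1 return.

```c
// count[i] is the number of codes that are i+1 bits long (i = 0..15).
// Writes each symbol's length to size[] and its code to code[], in order of
// increasing length. Returns the number of symbols, or -1 if there are more
// than 256 symbols or a length needs more codes than it can provide.
int sketch_assign_codes(const int count[16], unsigned char size[256],
                        unsigned short code[256])
{
   int i, j, k = 0;
   unsigned int next = 0;

   for (i = 0; i < 16; ++i) {
      for (j = 0; j < count[i]; ++j) {
         if (k >= 256) return -1;
         if (next >= (1u << (i + 1))) return -1;   // no codes left at this length
         size[k] = (unsigned char) (i + 1);
         code[k] = (unsigned short) next++;
         ++k;
      }
      next <<= 1;   // moving to the next length appends a zero bit
   }
   return k;
}
```

   For counts of two 2-bit codes and one 3-bit code this yields the codes
   00, 01 and 100.
*/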
// build a table that decodes both magnitude and value of small ACs in one go.
2048 static void stbi__build_fast_ac(stbi__int16 *fast_ac, stbi__huffman *h)
2051 for (i=0; i < (1 << FAST_BITS); ++i) {
2052 stbi_uc fast = h->fast[i];
2055 int rs = h->values[fast];
2056 int run = (rs >> 4) & 15;
2057 int magbits = rs & 15;
2058 int len = h->size[fast];
2060 if (magbits && len + magbits <= FAST_BITS) {
2061 // magnitude code followed by receive_extend code
2062 int k = ((i << len) & ((1 << FAST_BITS) - 1)) >> (FAST_BITS - magbits);
2063 int m = 1 << (magbits - 1);
2064 if (k < m) k += (~0U << magbits) + 1;
2065 // if the result is small enough, we can fit it in fast_ac table
2066 if (k >= -128 && k <= 127)
2067 fast_ac[i] = (stbi__int16) ((k * 256) + (run * 16) + (len + magbits));
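/*
Illustrative sketch, not part of stb_image: each fast_ac entry packs three
fields into 16 bits as value*256 + run*16 + bits. The helper names below are
hypothetical. Like the decoder, the unpacking assumes that >> on a negative
int is an arithmetic shift, which C leaves implementation-defined.

```c
#include <stdint.h>

// value in -128..127, run in 0..15, total_bits in 1..15
static int16_t demo_pack_fast_ac(int value, int run, int total_bits)
{
    return (int16_t)(value * 256 + run * 16 + total_bits);
}

// coefficient value (assumes arithmetic right shift for negative entries)
static int demo_fast_ac_value(int16_t e) { return e >> 8; }

// number of zero coefficients to skip before this one
static int demo_fast_ac_run(int16_t e) { return (e >> 4) & 15; }

// bits consumed: huffman code length plus magnitude bits
static int demo_fast_ac_bits(int16_t e) { return e & 15; }
```

An entry of 0 still means "no fast path", which is why only nonzero
magnitudes are stored.
*/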
2073 static void stbi__grow_buffer_unsafe(stbi__jpeg *j)
2076 unsigned int b = j->nomore ? 0 : stbi__get8(j->s);
2078 int c = stbi__get8(j->s);
2079 while (c == 0xff) c = stbi__get8(j->s); // consume fill bytes
2081 j->marker = (unsigned char) c;
2086 j->code_buffer |= b << (24 - j->code_bits);
2088 } while (j->code_bits <= 24);
2092 static const stbi__uint32 stbi__bmask[17]={0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535};
2094 // decode a jpeg huffman value from the bitstream
2095 stbi_inline static int stbi__jpeg_huff_decode(stbi__jpeg *j, stbi__huffman *h)
2100 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
2102 // look at the top FAST_BITS and determine what symbol ID it is,
2103 // if the code is <= FAST_BITS
2104 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
2108 if (s > j->code_bits)
2110 j->code_buffer <<= s;
2112 return h->values[k];
2115 // naive test is to shift the code_buffer down so k bits are
2116 // valid, then test against maxcode. To speed this up, we've
2117 // preshifted maxcode left so that it has (16-k) 0s at the
2118 // end; in other words, regardless of the number of bits, it
2119 // wants to be compared against something shifted to have 16;
2120 // that way we don't need to shift inside the loop.
2121 temp = j->code_buffer >> 16;
2122 for (k=FAST_BITS+1 ; ; ++k)
2123 if (temp < h->maxcode[k])
2126 // error! code not found
2131 if (k > j->code_bits)
2134 // convert the huffman code to the symbol id
2135 c = ((j->code_buffer >> (32 - k)) & stbi__bmask[k]) + h->delta[k];
2136 if(c < 0 || c >= 256) // symbol id out of bounds!
2138 STBI_ASSERT((((j->code_buffer) >> (32 - h->size[c])) & stbi__bmask[h->size[c]]) == h->code[c]);
2140 // convert the id to a symbol
2142 j->code_buffer <<= k;
2143 return h->values[c];
2146 // bias[n] = (-1<<n) + 1
2147 static const int stbi__jbias[16] = {0,-1,-3,-7,-15,-31,-63,-127,-255,-511,-1023,-2047,-4095,-8191,-16383,-32767};
2149 // combined JPEG 'receive' and JPEG 'extend', since baseline
2150 // always extends everything it receives.
2151 stbi_inline static int stbi__extend_receive(stbi__jpeg *j, int n)
2155 if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
2156 if (j->code_bits < n) return 0; // ran out of bits from stream, return 0s instead of continuing
2158 sgn = j->code_buffer >> 31; // sign bit always in MSB; 0 if MSB clear (positive), 1 if MSB set (negative)
2159 k = stbi_lrot(j->code_buffer, n);
2160 j->code_buffer = k & ~stbi__bmask[n];
2161 k &= stbi__bmask[n];
2163 return k + (stbi__jbias[n] & (sgn - 1));
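/*
Illustrative sketch, not part of stb_image: the "extend" half of the routine
above, written without the branch-free sign trick. demo_extend is a
hypothetical name; it takes the n raw bits as an unsigned value.

```c
// Map n raw bits (n in 0..15) to a signed coefficient.
// A leading 1 bit means the value is positive and used as-is;
// a leading 0 bit means negative, so add the bias (-1<<n) + 1.
static int demo_extend(unsigned int bits, int n)
{
    if (n == 0) return 0;
    if (bits >> (n - 1)) return (int)bits;
    return (int)bits - (1 << n) + 1;
}
```

For n = 3 the raw values 0..3 map to -7..-4 and 4..7 map to themselves.
*/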
2166 // get some unsigned bits
2167 stbi_inline static int stbi__jpeg_get_bits(stbi__jpeg *j, int n)
2170 if (j->code_bits < n) stbi__grow_buffer_unsafe(j);
2171 if (j->code_bits < n) return 0; // ran out of bits from stream, return 0s instead of continuing
2172 k = stbi_lrot(j->code_buffer, n);
2173 j->code_buffer = k & ~stbi__bmask[n];
2174 k &= stbi__bmask[n];
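/*
Illustrative sketch, not part of stb_image: the rotate-and-mask idiom used
above to pull n bits off an MSB-aligned 32-bit buffer. The struct and
function names are hypothetical, and refilling the buffer is left out.

```c
#include <stdint.h>

typedef struct {
    uint32_t buffer; // valid bits sit at the most significant end
    int bits;        // how many bits of buffer are valid
} demo_bitbuf;

static uint32_t demo_rotl32(uint32_t x, int n)
{
    return (x << n) | (x >> ((32 - n) & 31));
}

// Read n bits (n in 1..16); returns 0 if the buffer holds fewer than n.
static int demo_get_bits(demo_bitbuf *b, int n)
{
    uint32_t mask = (1u << n) - 1;
    uint32_t k;
    if (b->bits < n) return 0;
    k = demo_rotl32(b->buffer, n); // top n bits wrap around to the bottom
    b->buffer = k & ~mask;         // what remains is MSB-aligned again
    b->bits -= n;
    return (int)(k & mask);
}
```

One rotate yields both the extracted bits (low end) and the shifted
remainder (high end), so no separate shift of the buffer is needed.
*/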
2179 stbi_inline static int stbi__jpeg_get_bit(stbi__jpeg *j)
2182 if (j->code_bits < 1) stbi__grow_buffer_unsafe(j);
2183 if (j->code_bits < 1) return 0; // ran out of bits from stream, return 0s instead of continuing
2185 j->code_buffer <<= 1;
2187 return k & 0x80000000;
2190 // given a value that's at position X in the zigzag stream,
2191 // where does it appear in the 8x8 matrix coded as row-major?
2192 static const stbi_uc stbi__jpeg_dezigzag[64+15] =
2194 0, 1, 8, 16, 9, 2, 3, 10,
2195 17, 24, 32, 25, 18, 11, 4, 5,
2196 12, 19, 26, 33, 40, 48, 41, 34,
2197 27, 20, 13, 6, 7, 14, 21, 28,
2198 35, 42, 49, 56, 57, 50, 43, 36,
2199 29, 22, 15, 23, 30, 37, 44, 51,
2200 58, 59, 52, 45, 38, 31, 39, 46,
2201 53, 60, 61, 54, 47, 55, 62, 63,
2202 // let corrupt input sample past end
2203 63, 63, 63, 63, 63, 63, 63, 63,
2204 63, 63, 63, 63, 63, 63, 63
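/*
Illustrative sketch, not part of stb_image: the first 64 entries of the table
above can be regenerated by walking the 8x8 block along its anti-diagonals.
demo_make_dezigzag is a hypothetical helper, handy for checking the table by
hand; the 15 padding entries are not produced.

```c
// order[i] = row-major index (y*8 + x) of the i-th zigzag position
static void demo_make_dezigzag(unsigned char order[64])
{
    int x = 0, y = 0, i;
    for (i = 0; i < 64; ++i) {
        order[i] = (unsigned char)(y * 8 + x);
        if (((x + y) & 1) == 0) {   // even diagonal: travel up and right
            if (x == 7) ++y;
            else if (y == 0) ++x;
            else { ++x; --y; }
        } else {                    // odd diagonal: travel down and left
            if (y == 7) ++x;
            else if (x == 0) ++y;
            else { --x; ++y; }
        }
    }
}
```
*/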
2207 // decode one 64-entry block--
2208 static int stbi__jpeg_decode_block(stbi__jpeg *j, short data[64], stbi__huffman *hdc, stbi__huffman *hac, stbi__int16 *fac, int b, stbi__uint16 *dequant)
2213 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
2214 t = stbi__jpeg_huff_decode(j, hdc);
2215 if (t < 0 || t > 15) return stbi__err("bad huffman code","Corrupt JPEG");
2217 // 0 all the ac values now so we can do it 32-bits at a time
2218 memset(data,0,64*sizeof(data[0]));
2220 diff = t ? stbi__extend_receive(j, t) : 0;
2221 if (!stbi__addints_valid(j->img_comp[b].dc_pred, diff)) return stbi__err("bad delta","Corrupt JPEG");
2222 dc = j->img_comp[b].dc_pred + diff;
2223 j->img_comp[b].dc_pred = dc;
2224 if (!stbi__mul2shorts_valid(dc, dequant[0])) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
2225 data[0] = (short) (dc * dequant[0]);
2227 // decode AC components, see JPEG spec
2232 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
2233 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
2235 if (r) { // fast-AC path
2236 k += (r >> 4) & 15; // run
2237 s = r & 15; // combined length
2238 if (s > j->code_bits) return stbi__err("bad huffman code", "Combined length longer than code bits available");
2239 j->code_buffer <<= s;
2241 // decode into unzigzag'd location
2242 zig = stbi__jpeg_dezigzag[k++];
2243 data[zig] = (short) ((r >> 8) * dequant[zig]);
2245 int rs = stbi__jpeg_huff_decode(j, hac);
2246 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
2250 if (rs != 0xf0) break; // end block
2254 // decode into unzigzag'd location
2255 zig = stbi__jpeg_dezigzag[k++];
2256 data[zig] = (short) (stbi__extend_receive(j,s) * dequant[zig]);
2263 static int stbi__jpeg_decode_block_prog_dc(stbi__jpeg *j, short data[64], stbi__huffman *hdc, int b)
2267 if (j->spec_end != 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
2269 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
2271 if (j->succ_high == 0) {
2272 // first scan for DC coefficient, must be first
2273 memset(data,0,64*sizeof(data[0])); // 0 all the ac values now
2274 t = stbi__jpeg_huff_decode(j, hdc);
2275 if (t < 0 || t > 15) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
2276 diff = t ? stbi__extend_receive(j, t) : 0;
2278 if (!stbi__addints_valid(j->img_comp[b].dc_pred, diff)) return stbi__err("bad delta", "Corrupt JPEG");
2279 dc = j->img_comp[b].dc_pred + diff;
2280 j->img_comp[b].dc_pred = dc;
2281 if (!stbi__mul2shorts_valid(dc, 1 << j->succ_low)) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
2282 data[0] = (short) (dc * (1 << j->succ_low));
2284 // refinement scan for DC coefficient
2285 if (stbi__jpeg_get_bit(j))
2286 data[0] += (short) (1 << j->succ_low);
2291 // @OPTIMIZE: store non-zigzagged during the decode passes,
2292 // and only de-zigzag when dequantizing
2293 static int stbi__jpeg_decode_block_prog_ac(stbi__jpeg *j, short data[64], stbi__huffman *hac, stbi__int16 *fac)
2296 if (j->spec_start == 0) return stbi__err("can't merge dc and ac", "Corrupt JPEG");
2298 if (j->succ_high == 0) {
2299 int shift = j->succ_low;
2310 if (j->code_bits < 16) stbi__grow_buffer_unsafe(j);
2311 c = (j->code_buffer >> (32 - FAST_BITS)) & ((1 << FAST_BITS)-1);
2313 if (r) { // fast-AC path
2314 k += (r >> 4) & 15; // run
2315 s = r & 15; // combined length
2316 if (s > j->code_bits) return stbi__err("bad huffman code", "Combined length longer than code bits available");
2317 j->code_buffer <<= s;
2319 zig = stbi__jpeg_dezigzag[k++];
2320 data[zig] = (short) ((r >> 8) * (1 << shift));
2322 int rs = stbi__jpeg_huff_decode(j, hac);
2323 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
2328 j->eob_run = (1 << r);
2330 j->eob_run += stbi__jpeg_get_bits(j, r);
2337 zig = stbi__jpeg_dezigzag[k++];
2338 data[zig] = (short) (stbi__extend_receive(j,s) * (1 << shift));
2341 } while (k <= j->spec_end);
2343 // refinement scan for these AC coefficients
2345 short bit = (short) (1 << j->succ_low);
2349 for (k = j->spec_start; k <= j->spec_end; ++k) {
2350 short *p = &data[stbi__jpeg_dezigzag[k]];
2352 if (stbi__jpeg_get_bit(j))
2353 if ((*p & bit)==0) {
2364 int rs = stbi__jpeg_huff_decode(j, hac); // @OPTIMIZE see if we can use the fast path here, advance-by-r is so slow, eh
2365 if (rs < 0) return stbi__err("bad huffman code","Corrupt JPEG");
2370 j->eob_run = (1 << r) - 1;
2372 j->eob_run += stbi__jpeg_get_bits(j, r);
2373 r = 64; // force end of block
2375 // r=15 s=0 should write 16 0s, so we just do
2376 // a run of 15 0s and then write s (which is 0),
2377 // so we don't have to do anything special here
2380 if (s != 1) return stbi__err("bad huffman code", "Corrupt JPEG");
2382 if (stbi__jpeg_get_bit(j))
2389 while (k <= j->spec_end) {
2390 short *p = &data[stbi__jpeg_dezigzag[k++]];
2392 if (stbi__jpeg_get_bit(j))
2393 if ((*p & bit)==0) {
2407 } while (k <= j->spec_end);
2413 // clamp an int to 0..255 (callers have already added the +128 bias to the -128..127 IDCT output)
2414 stbi_inline static stbi_uc stbi__clamp(int x)
2416 // trick to use a single test to catch both cases
2417 if ((unsigned int) x > 255) {
2418 if (x < 0) return 0;
2419 if (x > 255) return 255;
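/*
Illustrative sketch, not part of stb_image: the single-test clamp in
isolation. demo_clamp is a hypothetical name. Casting to unsigned turns any
negative x into a very large value, so one comparison catches both x < 0 and
x > 255 and in-range values take no further branch.

```c
static unsigned char demo_clamp(int x)
{
    if ((unsigned int)x > 255) {
        // out of range on one side or the other
        return (unsigned char)(x < 0 ? 0 : 255);
    }
    return (unsigned char)x;
}
```
*/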
2424 #define stbi__f2f(x) ((int) (((x) * 4096 + 0.5)))
2425 #define stbi__fsh(x) ((x) * 4096)
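/*
Illustrative sketch, not part of stb_image: what the two macros above do.
Constants are scaled by 4096 (1<<12) so the IDCT can run in integers. The
helper names are hypothetical, and the rounding in demo_fixed_mul is only for
this demo; the IDCT itself keeps the scaled values and rounds once per pass.

```c
// real constant -> 12-bit fixed point, as stbi__f2f does
static int demo_f2f(double x) { return (int)(x * 4096 + 0.5); }

// multiply v by a fixed-point constant c and scale back down
// (assumes v * c is non-negative, so >> is well defined)
static int demo_fixed_mul(int v, int c) { return (v * c + 2048) >> 12; }
```

For example 0.5411961 becomes 2217, and 100 times that constant comes back
as 54.
*/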
2427 // derived from jidctint -- DCT_ISLOW
2428 #define STBI__IDCT_1D(s0,s1,s2,s3,s4,s5,s6,s7) \
2429 int t0,t1,t2,t3,p1,p2,p3,p4,p5,x0,x1,x2,x3; \
2432 p1 = (p2+p3) * stbi__f2f(0.5411961f); \
2433 t2 = p1 + p3*stbi__f2f(-1.847759065f); \
2434 t3 = p1 + p2*stbi__f2f( 0.765366865f); \
2437 t0 = stbi__fsh(p2+p3); \
2438 t1 = stbi__fsh(p2-p3); \
2451 p5 = (p3+p4)*stbi__f2f( 1.175875602f); \
2452 t0 = t0*stbi__f2f( 0.298631336f); \
2453 t1 = t1*stbi__f2f( 2.053119869f); \
2454 t2 = t2*stbi__f2f( 3.072711026f); \
2455 t3 = t3*stbi__f2f( 1.501321110f); \
2456 p1 = p5 + p1*stbi__f2f(-0.899976223f); \
2457 p2 = p5 + p2*stbi__f2f(-2.562915447f); \
2458 p3 = p3*stbi__f2f(-1.961570560f); \
2459 p4 = p4*stbi__f2f(-0.390180644f); \
2465 static void stbi__idct_block(stbi_uc *out, int out_stride, short data[64])
2467 int i,val[64],*v=val;
2472 for (i=0; i < 8; ++i,++d, ++v) {
2473 // if all zeroes, shortcut -- this avoids dequantizing 0s and IDCTing
2474 if (d[ 8]==0 && d[16]==0 && d[24]==0 && d[32]==0
2475 && d[40]==0 && d[48]==0 && d[56]==0) {
2476 // no shortcut 0 seconds
2477 // (1|2|3|4|5|6|7)==0 0 seconds
2478 // all separate -0.047 seconds
2479 // 1 && 2|3 && 4|5 && 6|7: -0.047 seconds
2480 int dcterm = d[0]*4;
2481 v[0] = v[8] = v[16] = v[24] = v[32] = v[40] = v[48] = v[56] = dcterm;
2483 STBI__IDCT_1D(d[ 0],d[ 8],d[16],d[24],d[32],d[40],d[48],d[56])
2484 // constants scaled things up by 1<<12; let's bring them back
2485 // down, but keep 2 extra bits of precision
2486 x0 += 512; x1 += 512; x2 += 512; x3 += 512;
2487 v[ 0] = (x0+t3) >> 10;
2488 v[56] = (x0-t3) >> 10;
2489 v[ 8] = (x1+t2) >> 10;
2490 v[48] = (x1-t2) >> 10;
2491 v[16] = (x2+t1) >> 10;
2492 v[40] = (x2-t1) >> 10;
2493 v[24] = (x3+t0) >> 10;
2494 v[32] = (x3-t0) >> 10;
2498 for (i=0, v=val, o=out; i < 8; ++i,v+=8,o+=out_stride) {
2499 // no fast case since the first 1D IDCT spread components out
2500 STBI__IDCT_1D(v[0],v[1],v[2],v[3],v[4],v[5],v[6],v[7])
2501 // constants scaled things up by 1<<12, plus we had 1<<2 from first
2502 // loop, plus horizontal and vertical each scale by sqrt(8) so together
2503 // we've got an extra 1<<3, so 1<<17 total we need to remove.
2504 // so we want to round that, which means adding 0.5 * 1<<17,
2505 // aka 65536. Also, we'll end up with -128 to 127 that we want
2506 // to encode as 0..255 by adding 128, so we'll add that before the shift
2507 x0 += 65536 + (128<<17);
2508 x1 += 65536 + (128<<17);
2509 x2 += 65536 + (128<<17);
2510 x3 += 65536 + (128<<17);
2511 // tried computing the shifts into temps, or'ing the temps to see
2512 // if any were out of range, but that was slower
2513 o[0] = stbi__clamp((x0+t3) >> 17);
2514 o[7] = stbi__clamp((x0-t3) >> 17);
2515 o[1] = stbi__clamp((x1+t2) >> 17);
2516 o[6] = stbi__clamp((x1-t2) >> 17);
2517 o[2] = stbi__clamp((x2+t1) >> 17);
2518 o[5] = stbi__clamp((x2-t1) >> 17);
2519 o[3] = stbi__clamp((x3+t0) >> 17);
2520 o[4] = stbi__clamp((x3-t0) >> 17);
2525 // sse2 integer IDCT. not the fastest possible implementation but it
2526 // produces bit-identical results to the generic C version so it's
2527 // fully "transparent".
2528 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2530 // This is constructed to match our regular (generic) integer IDCT exactly.
2531 __m128i row0, row1, row2, row3, row4, row5, row6, row7;
2534 // dot product constant: even elems=x, odd elems=y
2535 #define dct_const(x,y) _mm_setr_epi16((x),(y),(x),(y),(x),(y),(x),(y))
2537 // out(0) = c0[even]*x + c0[odd]*y (c0, x, y 16-bit, out 32-bit)
2538 // out(1) = c1[even]*x + c1[odd]*y
2539 #define dct_rot(out0,out1, x,y,c0,c1) \
2540 __m128i c0##lo = _mm_unpacklo_epi16((x),(y)); \
2541 __m128i c0##hi = _mm_unpackhi_epi16((x),(y)); \
2542 __m128i out0##_l = _mm_madd_epi16(c0##lo, c0); \
2543 __m128i out0##_h = _mm_madd_epi16(c0##hi, c0); \
2544 __m128i out1##_l = _mm_madd_epi16(c0##lo, c1); \
2545 __m128i out1##_h = _mm_madd_epi16(c0##hi, c1)
2547 // out = in << 12 (in 16-bit, out 32-bit)
2548 #define dct_widen(out, in) \
2549 __m128i out##_l = _mm_srai_epi32(_mm_unpacklo_epi16(_mm_setzero_si128(), (in)), 4); \
2550 __m128i out##_h = _mm_srai_epi32(_mm_unpackhi_epi16(_mm_setzero_si128(), (in)), 4)
2553 #define dct_wadd(out, a, b) \
2554 __m128i out##_l = _mm_add_epi32(a##_l, b##_l); \
2555 __m128i out##_h = _mm_add_epi32(a##_h, b##_h)
2558 #define dct_wsub(out, a, b) \
2559 __m128i out##_l = _mm_sub_epi32(a##_l, b##_l); \
2560 __m128i out##_h = _mm_sub_epi32(a##_h, b##_h)
2562 // butterfly a/b, add bias, then shift by "s" and pack
2563 #define dct_bfly32o(out0, out1, a,b,bias,s) \
2565 __m128i abiased_l = _mm_add_epi32(a##_l, bias); \
2566 __m128i abiased_h = _mm_add_epi32(a##_h, bias); \
2567 dct_wadd(sum, abiased, b); \
2568 dct_wsub(dif, abiased, b); \
2569 out0 = _mm_packs_epi32(_mm_srai_epi32(sum_l, s), _mm_srai_epi32(sum_h, s)); \
2570 out1 = _mm_packs_epi32(_mm_srai_epi32(dif_l, s), _mm_srai_epi32(dif_h, s)); \
2573 // 8-bit interleave step (for transposes)
2574 #define dct_interleave8(a, b) \
2576 a = _mm_unpacklo_epi8(a, b); \
2577 b = _mm_unpackhi_epi8(tmp, b)
2579 // 16-bit interleave step (for transposes)
2580 #define dct_interleave16(a, b) \
2582 a = _mm_unpacklo_epi16(a, b); \
2583 b = _mm_unpackhi_epi16(tmp, b)
2585 #define dct_pass(bias,shift) \
2588 dct_rot(t2e,t3e, row2,row6, rot0_0,rot0_1); \
2589 __m128i sum04 = _mm_add_epi16(row0, row4); \
2590 __m128i dif04 = _mm_sub_epi16(row0, row4); \
2591 dct_widen(t0e, sum04); \
2592 dct_widen(t1e, dif04); \
2593 dct_wadd(x0, t0e, t3e); \
2594 dct_wsub(x3, t0e, t3e); \
2595 dct_wadd(x1, t1e, t2e); \
2596 dct_wsub(x2, t1e, t2e); \
2598 dct_rot(y0o,y2o, row7,row3, rot2_0,rot2_1); \
2599 dct_rot(y1o,y3o, row5,row1, rot3_0,rot3_1); \
2600 __m128i sum17 = _mm_add_epi16(row1, row7); \
2601 __m128i sum35 = _mm_add_epi16(row3, row5); \
2602 dct_rot(y4o,y5o, sum17,sum35, rot1_0,rot1_1); \
2603 dct_wadd(x4, y0o, y4o); \
2604 dct_wadd(x5, y1o, y5o); \
2605 dct_wadd(x6, y2o, y5o); \
2606 dct_wadd(x7, y3o, y4o); \
2607 dct_bfly32o(row0,row7, x0,x7,bias,shift); \
2608 dct_bfly32o(row1,row6, x1,x6,bias,shift); \
2609 dct_bfly32o(row2,row5, x2,x5,bias,shift); \
2610 dct_bfly32o(row3,row4, x3,x4,bias,shift); \
2613 __m128i rot0_0 = dct_const(stbi__f2f(0.5411961f), stbi__f2f(0.5411961f) + stbi__f2f(-1.847759065f));
2614 __m128i rot0_1 = dct_const(stbi__f2f(0.5411961f) + stbi__f2f( 0.765366865f), stbi__f2f(0.5411961f));
2615 __m128i rot1_0 = dct_const(stbi__f2f(1.175875602f) + stbi__f2f(-0.899976223f), stbi__f2f(1.175875602f));
2616 __m128i rot1_1 = dct_const(stbi__f2f(1.175875602f), stbi__f2f(1.175875602f) + stbi__f2f(-2.562915447f));
2617 __m128i rot2_0 = dct_const(stbi__f2f(-1.961570560f) + stbi__f2f( 0.298631336f), stbi__f2f(-1.961570560f));
2618 __m128i rot2_1 = dct_const(stbi__f2f(-1.961570560f), stbi__f2f(-1.961570560f) + stbi__f2f( 3.072711026f));
2619 __m128i rot3_0 = dct_const(stbi__f2f(-0.390180644f) + stbi__f2f( 2.053119869f), stbi__f2f(-0.390180644f));
2620 __m128i rot3_1 = dct_const(stbi__f2f(-0.390180644f), stbi__f2f(-0.390180644f) + stbi__f2f( 1.501321110f));
2622 // rounding biases in column/row passes, see stbi__idct_block for explanation.
2623 __m128i bias_0 = _mm_set1_epi32(512);
2624 __m128i bias_1 = _mm_set1_epi32(65536 + (128<<17));
2627 row0 = _mm_load_si128((const __m128i *) (data + 0*8));
2628 row1 = _mm_load_si128((const __m128i *) (data + 1*8));
2629 row2 = _mm_load_si128((const __m128i *) (data + 2*8));
2630 row3 = _mm_load_si128((const __m128i *) (data + 3*8));
2631 row4 = _mm_load_si128((const __m128i *) (data + 4*8));
2632 row5 = _mm_load_si128((const __m128i *) (data + 5*8));
2633 row6 = _mm_load_si128((const __m128i *) (data + 6*8));
2634 row7 = _mm_load_si128((const __m128i *) (data + 7*8));
2637 dct_pass(bias_0, 10);
2640 // 16bit 8x8 transpose pass 1
2641 dct_interleave16(row0, row4);
2642 dct_interleave16(row1, row5);
2643 dct_interleave16(row2, row6);
2644 dct_interleave16(row3, row7);
2647 dct_interleave16(row0, row2);
2648 dct_interleave16(row1, row3);
2649 dct_interleave16(row4, row6);
2650 dct_interleave16(row5, row7);
2653 dct_interleave16(row0, row1);
2654 dct_interleave16(row2, row3);
2655 dct_interleave16(row4, row5);
2656 dct_interleave16(row6, row7);
2660 dct_pass(bias_1, 17);
2664 __m128i p0 = _mm_packus_epi16(row0, row1); // a0a1a2a3...a7b0b1b2b3...b7
2665 __m128i p1 = _mm_packus_epi16(row2, row3);
2666 __m128i p2 = _mm_packus_epi16(row4, row5);
2667 __m128i p3 = _mm_packus_epi16(row6, row7);
2669 // 8bit 8x8 transpose pass 1
2670 dct_interleave8(p0, p2); // a0e0a1e1...
2671 dct_interleave8(p1, p3); // c0g0c1g1...
2674 dct_interleave8(p0, p1); // a0c0e0g0...
2675 dct_interleave8(p2, p3); // b0d0f0h0...
2678 dct_interleave8(p0, p2); // a0b0c0d0...
2679 dct_interleave8(p1, p3); // a4b4c4d4...
2682 _mm_storel_epi64((__m128i *) out, p0); out += out_stride;
2683 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p0, 0x4e)); out += out_stride;
2684 _mm_storel_epi64((__m128i *) out, p2); out += out_stride;
2685 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p2, 0x4e)); out += out_stride;
2686 _mm_storel_epi64((__m128i *) out, p1); out += out_stride;
2687 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p1, 0x4e)); out += out_stride;
2688 _mm_storel_epi64((__m128i *) out, p3); out += out_stride;
2689 _mm_storel_epi64((__m128i *) out, _mm_shuffle_epi32(p3, 0x4e));
2698 #undef dct_interleave8
2699 #undef dct_interleave16
2707 // NEON integer IDCT. should produce bit-identical
2708 // results to the generic C version.
2709 static void stbi__idct_simd(stbi_uc *out, int out_stride, short data[64])
2711 int16x8_t row0, row1, row2, row3, row4, row5, row6, row7;
2713 int16x4_t rot0_0 = vdup_n_s16(stbi__f2f(0.5411961f));
2714 int16x4_t rot0_1 = vdup_n_s16(stbi__f2f(-1.847759065f));
2715 int16x4_t rot0_2 = vdup_n_s16(stbi__f2f( 0.765366865f));
2716 int16x4_t rot1_0 = vdup_n_s16(stbi__f2f( 1.175875602f));
2717 int16x4_t rot1_1 = vdup_n_s16(stbi__f2f(-0.899976223f));
2718 int16x4_t rot1_2 = vdup_n_s16(stbi__f2f(-2.562915447f));
2719 int16x4_t rot2_0 = vdup_n_s16(stbi__f2f(-1.961570560f));
2720 int16x4_t rot2_1 = vdup_n_s16(stbi__f2f(-0.390180644f));
2721 int16x4_t rot3_0 = vdup_n_s16(stbi__f2f( 0.298631336f));
2722 int16x4_t rot3_1 = vdup_n_s16(stbi__f2f( 2.053119869f));
2723 int16x4_t rot3_2 = vdup_n_s16(stbi__f2f( 3.072711026f));
2724 int16x4_t rot3_3 = vdup_n_s16(stbi__f2f( 1.501321110f));
2726 #define dct_long_mul(out, inq, coeff) \
2727 int32x4_t out##_l = vmull_s16(vget_low_s16(inq), coeff); \
2728 int32x4_t out##_h = vmull_s16(vget_high_s16(inq), coeff)
2730 #define dct_long_mac(out, acc, inq, coeff) \
2731 int32x4_t out##_l = vmlal_s16(acc##_l, vget_low_s16(inq), coeff); \
2732 int32x4_t out##_h = vmlal_s16(acc##_h, vget_high_s16(inq), coeff)
2734 #define dct_widen(out, inq) \
2735 int32x4_t out##_l = vshll_n_s16(vget_low_s16(inq), 12); \
2736 int32x4_t out##_h = vshll_n_s16(vget_high_s16(inq), 12)
2739 #define dct_wadd(out, a, b) \
2740 int32x4_t out##_l = vaddq_s32(a##_l, b##_l); \
2741 int32x4_t out##_h = vaddq_s32(a##_h, b##_h)
2744 #define dct_wsub(out, a, b) \
2745 int32x4_t out##_l = vsubq_s32(a##_l, b##_l); \
2746 int32x4_t out##_h = vsubq_s32(a##_h, b##_h)
2748 // butterfly a/b, then shift using "shiftop" by "s" and pack
2749 #define dct_bfly32o(out0,out1, a,b,shiftop,s) \
2751 dct_wadd(sum, a, b); \
2752 dct_wsub(dif, a, b); \
2753 out0 = vcombine_s16(shiftop(sum_l, s), shiftop(sum_h, s)); \
2754 out1 = vcombine_s16(shiftop(dif_l, s), shiftop(dif_h, s)); \
2757 #define dct_pass(shiftop, shift) \
2760 int16x8_t sum26 = vaddq_s16(row2, row6); \
2761 dct_long_mul(p1e, sum26, rot0_0); \
2762 dct_long_mac(t2e, p1e, row6, rot0_1); \
2763 dct_long_mac(t3e, p1e, row2, rot0_2); \
2764 int16x8_t sum04 = vaddq_s16(row0, row4); \
2765 int16x8_t dif04 = vsubq_s16(row0, row4); \
2766 dct_widen(t0e, sum04); \
2767 dct_widen(t1e, dif04); \
2768 dct_wadd(x0, t0e, t3e); \
2769 dct_wsub(x3, t0e, t3e); \
2770 dct_wadd(x1, t1e, t2e); \
2771 dct_wsub(x2, t1e, t2e); \
2773 int16x8_t sum15 = vaddq_s16(row1, row5); \
2774 int16x8_t sum17 = vaddq_s16(row1, row7); \
2775 int16x8_t sum35 = vaddq_s16(row3, row5); \
2776 int16x8_t sum37 = vaddq_s16(row3, row7); \
2777 int16x8_t sumodd = vaddq_s16(sum17, sum35); \
2778 dct_long_mul(p5o, sumodd, rot1_0); \
2779 dct_long_mac(p1o, p5o, sum17, rot1_1); \
2780 dct_long_mac(p2o, p5o, sum35, rot1_2); \
2781 dct_long_mul(p3o, sum37, rot2_0); \
2782 dct_long_mul(p4o, sum15, rot2_1); \
2783 dct_wadd(sump13o, p1o, p3o); \
2784 dct_wadd(sump24o, p2o, p4o); \
2785 dct_wadd(sump23o, p2o, p3o); \
2786 dct_wadd(sump14o, p1o, p4o); \
2787 dct_long_mac(x4, sump13o, row7, rot3_0); \
2788 dct_long_mac(x5, sump24o, row5, rot3_1); \
2789 dct_long_mac(x6, sump23o, row3, rot3_2); \
2790 dct_long_mac(x7, sump14o, row1, rot3_3); \
2791 dct_bfly32o(row0,row7, x0,x7,shiftop,shift); \
2792 dct_bfly32o(row1,row6, x1,x6,shiftop,shift); \
2793 dct_bfly32o(row2,row5, x2,x5,shiftop,shift); \
2794 dct_bfly32o(row3,row4, x3,x4,shiftop,shift); \
2798 row0 = vld1q_s16(data + 0*8);
2799 row1 = vld1q_s16(data + 1*8);
2800 row2 = vld1q_s16(data + 2*8);
2801 row3 = vld1q_s16(data + 3*8);
2802 row4 = vld1q_s16(data + 4*8);
2803 row5 = vld1q_s16(data + 5*8);
2804 row6 = vld1q_s16(data + 6*8);
2805 row7 = vld1q_s16(data + 7*8);
2808 row0 = vaddq_s16(row0, vsetq_lane_s16(1024, vdupq_n_s16(0), 0));
2811 dct_pass(vrshrn_n_s32, 10);
2813 // 16bit 8x8 transpose
2815 // these three map to a single VTRN.16, VTRN.32, and VSWP, respectively.
2816 // whether compilers actually get this is another story, sadly.
2817 #define dct_trn16(x, y) { int16x8x2_t t = vtrnq_s16(x, y); x = t.val[0]; y = t.val[1]; }
2818 #define dct_trn32(x, y) { int32x4x2_t t = vtrnq_s32(vreinterpretq_s32_s16(x), vreinterpretq_s32_s16(y)); x = vreinterpretq_s16_s32(t.val[0]); y = vreinterpretq_s16_s32(t.val[1]); }
2819 #define dct_trn64(x, y) { int16x8_t x0 = x; int16x8_t y0 = y; x = vcombine_s16(vget_low_s16(x0), vget_low_s16(y0)); y = vcombine_s16(vget_high_s16(x0), vget_high_s16(y0)); }
2822 dct_trn16(row0, row1); // a0b0a2b2a4b4a6b6
2823 dct_trn16(row2, row3);
2824 dct_trn16(row4, row5);
2825 dct_trn16(row6, row7);
2828 dct_trn32(row0, row2); // a0b0c0d0a4b4c4d4
2829 dct_trn32(row1, row3);
2830 dct_trn32(row4, row6);
2831 dct_trn32(row5, row7);
2834 dct_trn64(row0, row4); // a0b0c0d0e0f0g0h0
2835 dct_trn64(row1, row5);
2836 dct_trn64(row2, row6);
2837 dct_trn64(row3, row7);
2845 // vrshrn_n_s32 only supports shifts up to 16, we need
2846 // 17. so do a non-rounding shift of 16 first then follow
2847 // up with a rounding shift by 1.
2848 dct_pass(vshrn_n_s32, 16);
2852 uint8x8_t p0 = vqrshrun_n_s16(row0, 1);
2853 uint8x8_t p1 = vqrshrun_n_s16(row1, 1);
2854 uint8x8_t p2 = vqrshrun_n_s16(row2, 1);
2855 uint8x8_t p3 = vqrshrun_n_s16(row3, 1);
2856 uint8x8_t p4 = vqrshrun_n_s16(row4, 1);
2857 uint8x8_t p5 = vqrshrun_n_s16(row5, 1);
2858 uint8x8_t p6 = vqrshrun_n_s16(row6, 1);
2859 uint8x8_t p7 = vqrshrun_n_s16(row7, 1);
2861 // again, these can translate into one instruction, but often don't.
2862 #define dct_trn8_8(x, y) { uint8x8x2_t t = vtrn_u8(x, y); x = t.val[0]; y = t.val[1]; }
2863 #define dct_trn8_16(x, y) { uint16x4x2_t t = vtrn_u16(vreinterpret_u16_u8(x), vreinterpret_u16_u8(y)); x = vreinterpret_u8_u16(t.val[0]); y = vreinterpret_u8_u16(t.val[1]); }
2864 #define dct_trn8_32(x, y) { uint32x2x2_t t = vtrn_u32(vreinterpret_u32_u8(x), vreinterpret_u32_u8(y)); x = vreinterpret_u8_u32(t.val[0]); y = vreinterpret_u8_u32(t.val[1]); }
2866 // sadly can't use interleaved stores here since we only write
2867 // 8 bytes to each scan line!
2869 // 8x8 8-bit transpose pass 1
2876 dct_trn8_16(p0, p2);
2877 dct_trn8_16(p1, p3);
2878 dct_trn8_16(p4, p6);
2879 dct_trn8_16(p5, p7);
2882 dct_trn8_32(p0, p4);
2883 dct_trn8_32(p1, p5);
2884 dct_trn8_32(p2, p6);
2885 dct_trn8_32(p3, p7);
2888 vst1_u8(out, p0); out += out_stride;
2889 vst1_u8(out, p1); out += out_stride;
2890 vst1_u8(out, p2); out += out_stride;
2891 vst1_u8(out, p3); out += out_stride;
2892 vst1_u8(out, p4); out += out_stride;
2893 vst1_u8(out, p5); out += out_stride;
2894 vst1_u8(out, p6); out += out_stride;
2913 #define STBI__MARKER_none 0xff
2914 // if there's a pending marker from the entropy stream, return that
2915 // otherwise, fetch from the stream and get a marker. if there's no
2916 // marker, return 0xff, which is never a valid marker value
2917 static stbi_uc stbi__get_marker(stbi__jpeg *j)
2920 if (j->marker != STBI__MARKER_none) { x = j->marker; j->marker = STBI__MARKER_none; return x; }
2921 x = stbi__get8(j->s);
2922 if (x != 0xff) return STBI__MARKER_none;
2924 x = stbi__get8(j->s); // consume repeated 0xff fill bytes
2928 // in each scan, we'll have scan_n components, and the order
2929 // of the components is specified by order[]
2930 #define STBI__RESTART(x) ((x) >= 0xd0 && (x) <= 0xd7)
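/*
Illustrative sketch, not part of stb_image: marker scanning and the restart
test on a plain memory buffer. All names are hypothetical. Unlike the
stream-based code above, this version returns "no marker" when it runs off
the end of the buffer.

```c
#include <stddef.h>

#define DEMO_MARKER_NONE 0xff

// Read one marker starting at *pos; advances *pos past what it consumed.
static unsigned char demo_next_marker(const unsigned char *buf, size_t len, size_t *pos)
{
    unsigned char x;
    if (*pos >= len) return DEMO_MARKER_NONE;
    x = buf[(*pos)++];
    if (x != 0xff) return DEMO_MARKER_NONE; // a marker must start with 0xFF
    while (x == 0xff) {                     // skip repeated 0xFF fill bytes
        if (*pos >= len) return DEMO_MARKER_NONE;
        x = buf[(*pos)++];
    }
    return x;
}

// RST0..RST7 are the eight restart markers
static int demo_is_restart(unsigned char m)
{
    return m >= 0xd0 && m <= 0xd7;
}
```
*/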
2932 // after a restart interval, stbi__jpeg_reset the entropy decoder and
2933 // the dc prediction
2934 static void stbi__jpeg_reset(stbi__jpeg *j)
2939 j->img_comp[0].dc_pred = j->img_comp[1].dc_pred = j->img_comp[2].dc_pred = j->img_comp[3].dc_pred = 0;
2940 j->marker = STBI__MARKER_none;
2941 j->todo = j->restart_interval ? j->restart_interval : 0x7fffffff;
2943 // no more than 1<<31 MCUs if no restart_interval? that's plenty safe,
2944 // since we don't even allow 1<<30 pixels
2947 static int stbi__parse_entropy_coded_data(stbi__jpeg *z)
2949 stbi__jpeg_reset(z);
2950 if (!z->progressive) {
2951 if (z->scan_n == 1) {
2953 STBI_SIMD_ALIGN(short, data[64]);
2954 int n = z->order[0];
2955 // non-interleaved data, we just need to process one block at a time,
2956 // in trivial scanline order
2957 // number of blocks to do just depends on how many actual "pixels" this
2958 // component has, independent of interleaved MCU blocking and such
2959 int w = (z->img_comp[n].x+7) >> 3;
2960 int h = (z->img_comp[n].y+7) >> 3;
2961 for (j=0; j < h; ++j) {
2962 for (i=0; i < w; ++i) {
2963 int ha = z->img_comp[n].ha;
2964 if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2965 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
2966 // every data block is an MCU, so count down the restart interval
2967 if (--z->todo <= 0) {
2968 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
2969 // if it's NOT a restart, then just bail, so we get corrupt data
2970 // rather than no data
2971 if (!STBI__RESTART(z->marker)) return 1;
2972 stbi__jpeg_reset(z);
2977 } else { // interleaved
2979 STBI_SIMD_ALIGN(short, data[64]);
2980 for (j=0; j < z->img_mcu_y; ++j) {
2981 for (i=0; i < z->img_mcu_x; ++i) {
2982 // scan an interleaved mcu... process scan_n components in order
2983 for (k=0; k < z->scan_n; ++k) {
2984 int n = z->order[k];
2985 // scan out an mcu's worth of this component; that's just determined
2986 // by the basic H and V specified for the component
2987 for (y=0; y < z->img_comp[n].v; ++y) {
2988 for (x=0; x < z->img_comp[n].h; ++x) {
2989 int x2 = (i*z->img_comp[n].h + x)*8;
2990 int y2 = (j*z->img_comp[n].v + y)*8;
2991 int ha = z->img_comp[n].ha;
2992 if (!stbi__jpeg_decode_block(z, data, z->huff_dc+z->img_comp[n].hd, z->huff_ac+ha, z->fast_ac[ha], n, z->dequant[z->img_comp[n].tq])) return 0;
2993 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*y2+x2, z->img_comp[n].w2, data);
2997 // after all interleaved components, that's an interleaved MCU,
2998 // so now count down the restart interval
2999 if (--z->todo <= 0) {
3000 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
3001 if (!STBI__RESTART(z->marker)) return 1;
3002 stbi__jpeg_reset(z);
3009 if (z->scan_n == 1) {
3011 int n = z->order[0];
3012 // non-interleaved data, we just need to process one block at a time,
3013 // in trivial scanline order
3014 // number of blocks to do just depends on how many actual "pixels" this
3015 // component has, independent of interleaved MCU blocking and such
3016 int w = (z->img_comp[n].x+7) >> 3;
3017 int h = (z->img_comp[n].y+7) >> 3;
3018 for (j=0; j < h; ++j) {
3019 for (i=0; i < w; ++i) {
3020 short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
3021 if (z->spec_start == 0) {
3022 if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
3025 int ha = z->img_comp[n].ha;
3026 if (!stbi__jpeg_decode_block_prog_ac(z, data, &z->huff_ac[ha], z->fast_ac[ha]))
3029 // every data block is an MCU, so count down the restart interval
3030 if (--z->todo <= 0) {
3031 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
3032 if (!STBI__RESTART(z->marker)) return 1;
3033 stbi__jpeg_reset(z);
3038 } else { // interleaved
3040 for (j=0; j < z->img_mcu_y; ++j) {
3041 for (i=0; i < z->img_mcu_x; ++i) {
3042 // scan an interleaved mcu... process scan_n components in order
3043 for (k=0; k < z->scan_n; ++k) {
3044 int n = z->order[k];
3045 // scan out an mcu's worth of this component; that's just determined
3046 // by the basic H and V specified for the component
3047 for (y=0; y < z->img_comp[n].v; ++y) {
3048 for (x=0; x < z->img_comp[n].h; ++x) {
3049 int x2 = (i*z->img_comp[n].h + x);
3050 int y2 = (j*z->img_comp[n].v + y);
3051 short *data = z->img_comp[n].coeff + 64 * (x2 + y2 * z->img_comp[n].coeff_w);
3052 if (!stbi__jpeg_decode_block_prog_dc(z, data, &z->huff_dc[z->img_comp[n].hd], n))
3057 // after all interleaved components, that's an interleaved MCU,
3058 // so now count down the restart interval
3059 if (--z->todo <= 0) {
3060 if (z->code_bits < 24) stbi__grow_buffer_unsafe(z);
3061 if (!STBI__RESTART(z->marker)) return 1;
3062 stbi__jpeg_reset(z);
3071 static void stbi__jpeg_dequantize(short *data, stbi__uint16 *dequant)
3074 for (i=0; i < 64; ++i)
3075 data[i] *= dequant[i];
3078 static void stbi__jpeg_finish(stbi__jpeg *z)
3080 if (z->progressive) {
3081 // dequantize and idct the data
3083 for (n=0; n < z->s->img_n; ++n) {
3084 int w = (z->img_comp[n].x+7) >> 3;
3085 int h = (z->img_comp[n].y+7) >> 3;
3086 for (j=0; j < h; ++j) {
3087 for (i=0; i < w; ++i) {
3088 short *data = z->img_comp[n].coeff + 64 * (i + j * z->img_comp[n].coeff_w);
3089 stbi__jpeg_dequantize(data, z->dequant[z->img_comp[n].tq]);
3090 z->idct_block_kernel(z->img_comp[n].data+z->img_comp[n].w2*j*8+i*8, z->img_comp[n].w2, data);
3097 static int stbi__process_marker(stbi__jpeg *z, int m)
3101 case STBI__MARKER_none: // no marker found
3102 return stbi__err("expected marker","Corrupt JPEG");
3104 case 0xDD: // DRI - specify restart interval
3105 if (stbi__get16be(z->s) != 4) return stbi__err("bad DRI len","Corrupt JPEG");
3106 z->restart_interval = stbi__get16be(z->s);
3109 case 0xDB: // DQT - define quantization table
3110 L = stbi__get16be(z->s)-2;
3112 int q = stbi__get8(z->s);
3113 int p = q >> 4, sixteen = (p != 0);
3115 if (p != 0 && p != 1) return stbi__err("bad DQT type","Corrupt JPEG");
3116 if (t > 3) return stbi__err("bad DQT table","Corrupt JPEG");
3118 for (i=0; i < 64; ++i)
3119 z->dequant[t][stbi__jpeg_dezigzag[i]] = (stbi__uint16)(sixteen ? stbi__get16be(z->s) : stbi__get8(z->s));
3120 L -= (sixteen ? 129 : 65);
3124 case 0xC4: // DHT - define huffman table
3125 L = stbi__get16be(z->s)-2;
3128 int sizes[16],i,n=0;
3129 int q = stbi__get8(z->s);
3132 if (tc > 1 || th > 3) return stbi__err("bad DHT header","Corrupt JPEG");
3133 for (i=0; i < 16; ++i) {
3134 sizes[i] = stbi__get8(z->s);
3137 if(n > 256) return stbi__err("bad DHT header","Corrupt JPEG"); // Loop over i < n would write past end of values!
3140 if (!stbi__build_huffman(z->huff_dc+th, sizes)) return 0;
3141 v = z->huff_dc[th].values;
3143 if (!stbi__build_huffman(z->huff_ac+th, sizes)) return 0;
3144 v = z->huff_ac[th].values;
3146 for (i=0; i < n; ++i)
3147 v[i] = stbi__get8(z->s);
3149 stbi__build_fast_ac(z->fast_ac[th], z->huff_ac + th);
3155 // check for comment block or APP blocks
3156 if ((m >= 0xE0 && m <= 0xEF) || m == 0xFE) {
3157 L = stbi__get16be(z->s);
3160 return stbi__err("bad COM len","Corrupt JPEG");
3162 return stbi__err("bad APP len","Corrupt JPEG");
3166 if (m == 0xE0 && L >= 5) { // JFIF APP0 segment
3167 static const unsigned char tag[5] = {'J','F','I','F','\0'};
3170 for (i=0; i < 5; ++i)
3171 if (stbi__get8(z->s) != tag[i])
3176 } else if (m == 0xEE && L >= 12) { // Adobe APP14 segment
3177 static const unsigned char tag[6] = {'A','d','o','b','e','\0'};
3180 for (i=0; i < 6; ++i)
3181 if (stbi__get8(z->s) != tag[i])
3185 stbi__get8(z->s); // version
3186 stbi__get16be(z->s); // flags0
3187 stbi__get16be(z->s); // flags1
3188 z->app14_color_transform = stbi__get8(z->s); // color transform
3193 stbi__skip(z->s, L);
3197 return stbi__err("unknown marker","Corrupt JPEG");
3201 static int stbi__process_scan_header(stbi__jpeg *z)
3204 int Ls = stbi__get16be(z->s);
3205 z->scan_n = stbi__get8(z->s);
3206 if (z->scan_n < 1 || z->scan_n > 4 || z->scan_n > (int) z->s->img_n) return stbi__err("bad SOS component count","Corrupt JPEG");
3207 if (Ls != 6+2*z->scan_n) return stbi__err("bad SOS len","Corrupt JPEG");
3208 for (i=0; i < z->scan_n; ++i) {
3209 int id = stbi__get8(z->s), which;
3210 int q = stbi__get8(z->s);
3211 for (which = 0; which < z->s->img_n; ++which)
3212 if (z->img_comp[which].id == id)
3214 if (which == z->s->img_n) return 0; // no match
3215 z->img_comp[which].hd = q >> 4; if (z->img_comp[which].hd > 3) return stbi__err("bad DC huff","Corrupt JPEG");
3216 z->img_comp[which].ha = q & 15; if (z->img_comp[which].ha > 3) return stbi__err("bad AC huff","Corrupt JPEG");
3217 z->order[i] = which;
3222 z->spec_start = stbi__get8(z->s);
3223 z->spec_end = stbi__get8(z->s); // should be 63, but might be 0
3224 aa = stbi__get8(z->s);
3225 z->succ_high = (aa >> 4);
3226 z->succ_low = (aa & 15);
3227 if (z->progressive) {
3228 if (z->spec_start > 63 || z->spec_end > 63 || z->spec_start > z->spec_end || z->succ_high > 13 || z->succ_low > 13)
3229 return stbi__err("bad SOS", "Corrupt JPEG");
3231 if (z->spec_start != 0) return stbi__err("bad SOS","Corrupt JPEG");
3232 if (z->succ_high != 0 || z->succ_low != 0) return stbi__err("bad SOS","Corrupt JPEG");
3240 static int stbi__free_jpeg_components(stbi__jpeg *z, int ncomp, int why)
3243 for (i=0; i < ncomp; ++i) {
3244 if (z->img_comp[i].raw_data) {
3245 STBI_FREE(z->img_comp[i].raw_data);
3246 z->img_comp[i].raw_data = NULL;
3247 z->img_comp[i].data = NULL;
3249 if (z->img_comp[i].raw_coeff) {
3250 STBI_FREE(z->img_comp[i].raw_coeff);
3251 z->img_comp[i].raw_coeff = 0;
3252 z->img_comp[i].coeff = 0;
3254 if (z->img_comp[i].linebuf) {
3255 STBI_FREE(z->img_comp[i].linebuf);
3256 z->img_comp[i].linebuf = NULL;
3262 static int stbi__process_frame_header(stbi__jpeg *z, int scan)
3264 stbi__context *s = z->s;
3265 int Lf,p,i,q, h_max=1,v_max=1,c;
3266 Lf = stbi__get16be(s); if (Lf < 11) return stbi__err("bad SOF len","Corrupt JPEG"); // JPEG
3267 p = stbi__get8(s); if (p != 8) return stbi__err("only 8-bit","JPEG format not supported: 8-bit only"); // JPEG baseline
3268 s->img_y = stbi__get16be(s); if (s->img_y == 0) return stbi__err("no header height", "JPEG format not supported: delayed height"); // Legal, but we don't handle it--but neither does IJG
3269 s->img_x = stbi__get16be(s); if (s->img_x == 0) return stbi__err("0 width","Corrupt JPEG"); // JPEG requires
3270 if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
3271 if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
3273 if (c != 3 && c != 1 && c != 4) return stbi__err("bad component count","Corrupt JPEG");
3275 for (i=0; i < c; ++i) {
3276 z->img_comp[i].data = NULL;
3277 z->img_comp[i].linebuf = NULL;
3280 if (Lf != 8+3*s->img_n) return stbi__err("bad SOF len","Corrupt JPEG");
3283 for (i=0; i < s->img_n; ++i) {
3284 static const unsigned char rgb[3] = { 'R', 'G', 'B' };
3285 z->img_comp[i].id = stbi__get8(s);
3286 if (s->img_n == 3 && z->img_comp[i].id == rgb[i])
3289 z->img_comp[i].h = (q >> 4); if (!z->img_comp[i].h || z->img_comp[i].h > 4) return stbi__err("bad H","Corrupt JPEG");
3290 z->img_comp[i].v = q & 15; if (!z->img_comp[i].v || z->img_comp[i].v > 4) return stbi__err("bad V","Corrupt JPEG");
3291 z->img_comp[i].tq = stbi__get8(s); if (z->img_comp[i].tq > 3) return stbi__err("bad TQ","Corrupt JPEG");
3294 if (scan != STBI__SCAN_load) return 1;
3296 if (!stbi__mad3sizes_valid(s->img_x, s->img_y, s->img_n, 0)) return stbi__err("too large", "Image too large to decode");
3298 for (i=0; i < s->img_n; ++i) {
3299 if (z->img_comp[i].h > h_max) h_max = z->img_comp[i].h;
3300 if (z->img_comp[i].v > v_max) v_max = z->img_comp[i].v;
3303 // check that plane subsampling factors are integer ratios; our resamplers can't deal with fractional ratios
3304 // and I've never seen a non-corrupted JPEG file actually use them
3305 for (i=0; i < s->img_n; ++i) {
3306 if (h_max % z->img_comp[i].h != 0) return stbi__err("bad H","Corrupt JPEG");
3307 if (v_max % z->img_comp[i].v != 0) return stbi__err("bad V","Corrupt JPEG");
3310 // compute interleaved mcu info
3311 z->img_h_max = h_max;
3312 z->img_v_max = v_max;
3313 z->img_mcu_w = h_max * 8;
3314 z->img_mcu_h = v_max * 8;
3315 // these sizes can't be more than 17 bits
3316 z->img_mcu_x = (s->img_x + z->img_mcu_w-1) / z->img_mcu_w;
3317 z->img_mcu_y = (s->img_y + z->img_mcu_h-1) / z->img_mcu_h;
3319 for (i=0; i < s->img_n; ++i) {
3320 // number of effective pixels (e.g. for non-interleaved MCU)
3321 z->img_comp[i].x = (s->img_x * z->img_comp[i].h + h_max-1) / h_max;
3322 z->img_comp[i].y = (s->img_y * z->img_comp[i].v + v_max-1) / v_max;
3323 // to simplify generation, we'll allocate enough memory to decode
3324 // the bogus oversized data from using interleaved MCUs and their
3325 // big blocks (e.g. a 16x16 iMCU on an image of width 33); we won't
3326 // discard the extra data until colorspace conversion
3328 // img_mcu_x, img_mcu_y: <=17 bits; comp[i].h and .v are <=4 (checked earlier)
3329 // so these muls can't overflow with 32-bit ints (which we require)
3330 z->img_comp[i].w2 = z->img_mcu_x * z->img_comp[i].h * 8;
3331 z->img_comp[i].h2 = z->img_mcu_y * z->img_comp[i].v * 8;
3332 z->img_comp[i].coeff = 0;
3333 z->img_comp[i].raw_coeff = 0;
3334 z->img_comp[i].linebuf = NULL;
3335 z->img_comp[i].raw_data = stbi__malloc_mad2(z->img_comp[i].w2, z->img_comp[i].h2, 15);
3336 if (z->img_comp[i].raw_data == NULL)
3337 return stbi__free_jpeg_components(z, i+1, stbi__err("outofmem", "Out of memory"));
3338 // align blocks for idct using mmx/sse
3339 z->img_comp[i].data = (stbi_uc*) (((size_t) z->img_comp[i].raw_data + 15) & ~15);
3340 if (z->progressive) {
3341 // w2, h2 are multiples of 8 (see above)
3342 z->img_comp[i].coeff_w = z->img_comp[i].w2 / 8;
3343 z->img_comp[i].coeff_h = z->img_comp[i].h2 / 8;
3344 z->img_comp[i].raw_coeff = stbi__malloc_mad3(z->img_comp[i].w2, z->img_comp[i].h2, sizeof(short), 15);
3345 if (z->img_comp[i].raw_coeff == NULL)
3346 return stbi__free_jpeg_components(z, i+1, stbi__err("outofmem", "Out of memory"));
3347 z->img_comp[i].coeff = (short*) (((size_t) z->img_comp[i].raw_coeff + 15) & ~15);
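/* Illustrative sketch, not part of stb_image: the MCU and per-component
   size arithmetic in the frame-header code above can be hard to follow in
   place, so this restates it as a standalone function. The names demo_comp,
   demo_layout and demo_mcu_layout are hypothetical; only the formulas are
   taken from the listing. It assumes the sampling factors were already
   validated (1..4) as the listing does.

```c
#include <assert.h>

// Hypothetical per-component record; field names follow the listing.
typedef struct {
   int h, v;      // sampling factors, each in 1..4
   int x, y;      // effective size in samples
   int w2, h2;    // padded size covering whole MCUs (multiples of 8)
} demo_comp;

typedef struct {
   int mcu_w, mcu_h;   // size of one interleaved MCU in pixels
   int mcu_x, mcu_y;   // number of MCUs per row and per column
} demo_layout;

static demo_layout demo_mcu_layout(int img_x, int img_y, demo_comp *comp, int ncomp)
{
   demo_layout lay;
   int i, h_max = 1, v_max = 1;
   assert(img_x > 0 && img_y > 0 && ncomp >= 1 && ncomp <= 4);
   for (i = 0; i < ncomp; ++i) {
      if (comp[i].h > h_max) h_max = comp[i].h;
      if (comp[i].v > v_max) v_max = comp[i].v;
   }
   lay.mcu_w = h_max * 8;
   lay.mcu_h = v_max * 8;
   lay.mcu_x = (img_x + lay.mcu_w - 1) / lay.mcu_w;   // round up
   lay.mcu_y = (img_y + lay.mcu_h - 1) / lay.mcu_h;
   for (i = 0; i < ncomp; ++i) {
      comp[i].x  = (img_x * comp[i].h + h_max - 1) / h_max;
      comp[i].y  = (img_y * comp[i].v + v_max - 1) / v_max;
      comp[i].w2 = lay.mcu_x * comp[i].h * 8;
      comp[i].h2 = lay.mcu_y * comp[i].v * 8;
   }
   return lay;
}
```

   For a 33x17 image with a 2x2 luma component and a 1x1 chroma component,
   this gives 16x16 MCUs in a 3x2 grid, a 48x32 padded luma plane and a
   24x16 padded chroma plane, matching the "16x16 iMCU on an image of
   width 33" remark in the comments above.
*/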
3354 // use comparisons since in some cases we handle more than one case (e.g. SOF)
3355 #define stbi__DNL(x) ((x) == 0xdc)
3356 #define stbi__SOI(x) ((x) == 0xd8)
3357 #define stbi__EOI(x) ((x) == 0xd9)
3358 #define stbi__SOF(x) ((x) == 0xc0 || (x) == 0xc1 || (x) == 0xc2)
3359 #define stbi__SOS(x) ((x) == 0xda)
3361 #define stbi__SOF_progressive(x) ((x) == 0xc2)
3363 static int stbi__decode_jpeg_header(stbi__jpeg *z, int scan)
3367 z->app14_color_transform = -1; // valid values are 0,1,2
3368 z->marker = STBI__MARKER_none; // initialize cached marker to empty
3369 m = stbi__get_marker(z);
3370 if (!stbi__SOI(m)) return stbi__err("no SOI","Corrupt JPEG");
3371 if (scan == STBI__SCAN_type) return 1;
3372 m = stbi__get_marker(z);
3373 while (!stbi__SOF(m)) {
3374 if (!stbi__process_marker(z,m)) return 0;
3375 m = stbi__get_marker(z);
3376 while (m == STBI__MARKER_none) {
3377 // some files have extra padding after their blocks, so ok, we'll scan
3378 if (stbi__at_eof(z->s)) return stbi__err("no SOF", "Corrupt JPEG");
3379 m = stbi__get_marker(z);
3382 z->progressive = stbi__SOF_progressive(m);
3383 if (!stbi__process_frame_header(z, scan)) return 0;
3387 static int stbi__skip_jpeg_junk_at_end(stbi__jpeg *j)
3389 // some JPEGs have junk at end, skip over it but if we find what looks
3390 // like a valid marker, resume there
3391 while (!stbi__at_eof(j->s)) {
3392 int x = stbi__get8(j->s);
3393 while (x == 255) { // might be a marker
3394 if (stbi__at_eof(j->s)) return STBI__MARKER_none;
3395 x = stbi__get8(j->s);
3396 if (x != 0x00 && x != 0xff) {
3397 // not a stuffed zero or lead-in to another marker, looks
3398 // like an actual marker, return it
3401 // stuffed zero has x=0 now which ends the loop, meaning we go
3402 // back to regular scan loop.
3403 // repeated 0xff keeps trying to read the next byte of the marker.
3406 return STBI__MARKER_none;
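/* Illustrative sketch, not part of stb_image: the junk-skipping scan above
   reads through an stbi__context, which hides the control flow a little.
   This version runs the same loop over a plain byte buffer. The function
   name demo_next_marker and the sentinel DEMO_MARKER_NONE are hypothetical,
   and the sentinel value 0xff is this sketch's own choice.

```c
#include <stddef.h>

#define DEMO_MARKER_NONE 0xff   // hypothetical "no marker found" sentinel

// Scans buf from *pos, advancing *pos past every byte consumed.
// Returns the marker code that follows an 0xff lead-in, or DEMO_MARKER_NONE.
static int demo_next_marker(const unsigned char *buf, size_t len, size_t *pos)
{
   while (*pos < len) {
      int x = buf[(*pos)++];
      while (x == 0xff) {                  // might be a marker
         if (*pos >= len) return DEMO_MARKER_NONE;
         x = buf[(*pos)++];
         if (x != 0x00 && x != 0xff)       // neither stuffed zero nor fill byte
            return x;
         // x == 0x00 ends the inner loop; x == 0xff keeps reading
      }
   }
   return DEMO_MARKER_NONE;
}
```

   Given the bytes 12 ff 00 34 ff ff d9, the stuffed zero is skipped, the
   repeated ff is treated as a lead-in, and the call returns 0xd9.
*/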
3409 // decode image to YCbCr format
3410 static int stbi__decode_jpeg_image(stbi__jpeg *j)
3413 for (m = 0; m < 4; m++) {
3414 j->img_comp[m].raw_data = NULL;
3415 j->img_comp[m].raw_coeff = NULL;
3417 j->restart_interval = 0;
3418 if (!stbi__decode_jpeg_header(j, STBI__SCAN_load)) return 0;
3419 m = stbi__get_marker(j);
3420 while (!stbi__EOI(m)) {
3422 if (!stbi__process_scan_header(j)) return 0;
3423 if (!stbi__parse_entropy_coded_data(j)) return 0;
3424 if (j->marker == STBI__MARKER_none ) {
3425 j->marker = stbi__skip_jpeg_junk_at_end(j);
3426 // if we reach eof without hitting a marker, stbi__get_marker() below will fail and we'll eventually return 0
3428 m = stbi__get_marker(j);
3429 if (STBI__RESTART(m))
3430 m = stbi__get_marker(j);
3431 } else if (stbi__DNL(m)) {
3432 int Ld = stbi__get16be(j->s);
3433 stbi__uint32 NL = stbi__get16be(j->s);
3434 if (Ld != 4) return stbi__err("bad DNL len", "Corrupt JPEG");
3435 if (NL != j->s->img_y) return stbi__err("bad DNL height", "Corrupt JPEG");
3436 m = stbi__get_marker(j);
3438 if (!stbi__process_marker(j, m)) return 1;
3439 m = stbi__get_marker(j);
3443 stbi__jpeg_finish(j);
3447 // static jfif-centered resampling (across block boundaries)
3449 typedef stbi_uc *(*resample_row_func)(stbi_uc *out, stbi_uc *in0, stbi_uc *in1,
3452 #define stbi__div4(x) ((stbi_uc) ((x) >> 2))
3454 static stbi_uc *resample_row_1(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3457 STBI_NOTUSED(in_far);
3463 static stbi_uc* stbi__resample_row_v_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3465 // need to generate two samples vertically for every one in input
3468 for (i=0; i < w; ++i)
3469 out[i] = stbi__div4(3*in_near[i] + in_far[i] + 2);
3473 static stbi_uc* stbi__resample_row_h_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3475 // need to generate two samples horizontally for every one in input
3477 stbi_uc *input = in_near;
3480 // if only one sample, can't do any interpolation
3481 out[0] = out[1] = input[0];
3486 out[1] = stbi__div4(input[0]*3 + input[1] + 2);
3487 for (i=1; i < w-1; ++i) {
3488 int n = 3*input[i]+2;
3489 out[i*2+0] = stbi__div4(n+input[i-1]);
3490 out[i*2+1] = stbi__div4(n+input[i+1]);
3492 out[i*2+0] = stbi__div4(input[w-2]*3 + input[w-1] + 2);
3493 out[i*2+1] = input[w-1];
3495 STBI_NOTUSED(in_far);
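/* Illustrative sketch, not part of stb_image: a standalone copy of the
   horizontal 2x upsampling filter above, so the 3:1 weighting and the edge
   handling can be tried on a small array. The names demo_div4 and
   demo_upsample_h2 are hypothetical; the arithmetic mirrors the listing,
   including how it treats the last two output samples.

```c
#include <assert.h>

static unsigned char demo_div4(int x) { return (unsigned char)(x >> 2); }

// Writes 2*w samples to out from w samples in in.
static void demo_upsample_h2(unsigned char *out, const unsigned char *in, int w)
{
   int i;
   assert(w >= 1);
   if (w == 1) {                        // one sample: nothing to interpolate
      out[0] = out[1] = in[0];
      return;
   }
   out[0] = in[0];
   out[1] = demo_div4(in[0]*3 + in[1] + 2);
   for (i = 1; i < w-1; ++i) {
      int n = 3*in[i] + 2;              // shared 3/4 weight plus rounding term
      out[i*2+0] = demo_div4(n + in[i-1]);
      out[i*2+1] = demo_div4(n + in[i+1]);
   }
   out[i*2+0] = demo_div4(in[w-2]*3 + in[w-1] + 2);
   out[i*2+1] = in[w-1];
}
```

   With the input { 0, 100, 200 } the output is { 0, 25, 75, 125, 125, 200 }.
*/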
3501 #define stbi__div16(x) ((stbi_uc) ((x) >> 4))
3503 static stbi_uc *stbi__resample_row_hv_2(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3505 // need to generate 2x2 samples for every one in input
3508 out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
3512 t1 = 3*in_near[0] + in_far[0];
3513 out[0] = stbi__div4(t1+2);
3514 for (i=1; i < w; ++i) {
3516 t1 = 3*in_near[i]+in_far[i];
3517 out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
3518 out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
3520 out[w*2-1] = stbi__div4(t1+2);
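/* Illustrative sketch, not part of stb_image: the 2x2 upsampler above runs
   a vertical 3:1 blend first (values scaled by 4) and then a horizontal
   3:1 blend (total scale 16). This standalone copy makes the two stages
   easy to trace. The names hv_div4, hv_div16 and demo_upsample_hv2 are
   hypothetical.

```c
#include <assert.h>

static unsigned char hv_div4(int x)  { return (unsigned char)(x >> 2); }
static unsigned char hv_div16(int x) { return (unsigned char)(x >> 4); }

// Writes 2*w samples to out; near_row is the closer source row.
static void demo_upsample_hv2(unsigned char *out, const unsigned char *near_row,
                              const unsigned char *far_row, int w)
{
   int i, t0, t1;
   assert(w >= 1);
   if (w == 1) {
      out[0] = out[1] = hv_div4(3*near_row[0] + far_row[0] + 2);
      return;
   }
   t1 = 3*near_row[0] + far_row[0];           // vertical pass, scaled by 4
   out[0] = hv_div4(t1 + 2);
   for (i = 1; i < w; ++i) {
      t0 = t1;
      t1 = 3*near_row[i] + far_row[i];
      out[i*2-1] = hv_div16(3*t0 + t1 + 8);   // horizontal pass, scaled by 16
      out[i*2  ] = hv_div16(3*t1 + t0 + 8);
   }
   out[w*2-1] = hv_div4(t1 + 2);
}
```

   With both rows equal to { 100, 200 } the output is { 100, 125, 175, 200 }:
   the vertical blend is a no-op and only the horizontal ramp remains.
*/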
3527 #if defined(STBI_SSE2) || defined(STBI_NEON)
3528 static stbi_uc *stbi__resample_row_hv_2_simd(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3530 // need to generate 2x2 samples for every one in input
3534 out[0] = out[1] = stbi__div4(3*in_near[0] + in_far[0] + 2);
3538 t1 = 3*in_near[0] + in_far[0];
3539 // process groups of 8 pixels for as long as we can.
3540 // note we can't handle the last pixel in a row in this loop
3541 // because we need to handle the filter boundary conditions.
3542 for (; i < ((w-1) & ~7); i += 8) {
3543 #if defined(STBI_SSE2)
3544 // load and perform the vertical filtering pass
3545 // this uses 3*x + y = 4*x + (y - x)
3546 __m128i zero = _mm_setzero_si128();
3547 __m128i farb = _mm_loadl_epi64((__m128i *) (in_far + i));
3548 __m128i nearb = _mm_loadl_epi64((__m128i *) (in_near + i));
3549 __m128i farw = _mm_unpacklo_epi8(farb, zero);
3550 __m128i nearw = _mm_unpacklo_epi8(nearb, zero);
3551 __m128i diff = _mm_sub_epi16(farw, nearw);
3552 __m128i nears = _mm_slli_epi16(nearw, 2);
3553 __m128i curr = _mm_add_epi16(nears, diff); // current row
3555 // horizontal filter works the same based on shifted versions of the current
3556 // row. "prev" is current row shifted right by 1 pixel; we need to
3557 // insert the previous pixel value (from t1).
3558 // "next" is current row shifted left by 1 pixel, with first pixel
3559 // of next block of 8 pixels added in.
3560 __m128i prv0 = _mm_slli_si128(curr, 2);
3561 __m128i nxt0 = _mm_srli_si128(curr, 2);
3562 __m128i prev = _mm_insert_epi16(prv0, t1, 0);
3563 __m128i next = _mm_insert_epi16(nxt0, 3*in_near[i+8] + in_far[i+8], 7);
3565 // horizontal filter, polyphase implementation since it's convenient:
3566 // even pixels = 3*cur + prev = cur*4 + (prev - cur)
3567 // odd pixels = 3*cur + next = cur*4 + (next - cur)
3568 // note the shared term.
3569 __m128i bias = _mm_set1_epi16(8);
3570 __m128i curs = _mm_slli_epi16(curr, 2);
3571 __m128i prvd = _mm_sub_epi16(prev, curr);
3572 __m128i nxtd = _mm_sub_epi16(next, curr);
3573 __m128i curb = _mm_add_epi16(curs, bias);
3574 __m128i even = _mm_add_epi16(prvd, curb);
3575 __m128i odd = _mm_add_epi16(nxtd, curb);
3577 // interleave even and odd pixels, then undo scaling.
3578 __m128i int0 = _mm_unpacklo_epi16(even, odd);
3579 __m128i int1 = _mm_unpackhi_epi16(even, odd);
3580 __m128i de0 = _mm_srli_epi16(int0, 4);
3581 __m128i de1 = _mm_srli_epi16(int1, 4);
3583 // pack and write output
3584 __m128i outv = _mm_packus_epi16(de0, de1);
3585 _mm_storeu_si128((__m128i *) (out + i*2), outv);
3586 #elif defined(STBI_NEON)
3587 // load and perform the vertical filtering pass
3588 // this uses 3*x + y = 4*x + (y - x)
3589 uint8x8_t farb = vld1_u8(in_far + i);
3590 uint8x8_t nearb = vld1_u8(in_near + i);
3591 int16x8_t diff = vreinterpretq_s16_u16(vsubl_u8(farb, nearb));
3592 int16x8_t nears = vreinterpretq_s16_u16(vshll_n_u8(nearb, 2));
3593 int16x8_t curr = vaddq_s16(nears, diff); // current row
3595 // horizontal filter works the same based on shifted versions of the current
3596 // row. "prev" is current row shifted right by 1 pixel; we need to
3597 // insert the previous pixel value (from t1).
3598 // "next" is current row shifted left by 1 pixel, with first pixel
3599 // of next block of 8 pixels added in.
3600 int16x8_t prv0 = vextq_s16(curr, curr, 7);
3601 int16x8_t nxt0 = vextq_s16(curr, curr, 1);
3602 int16x8_t prev = vsetq_lane_s16(t1, prv0, 0);
3603 int16x8_t next = vsetq_lane_s16(3*in_near[i+8] + in_far[i+8], nxt0, 7);
3605 // horizontal filter, polyphase implementation since it's convenient:
3606 // even pixels = 3*cur + prev = cur*4 + (prev - cur)
3607 // odd pixels = 3*cur + next = cur*4 + (next - cur)
3608 // note the shared term.
3609 int16x8_t curs = vshlq_n_s16(curr, 2);
3610 int16x8_t prvd = vsubq_s16(prev, curr);
3611 int16x8_t nxtd = vsubq_s16(next, curr);
3612 int16x8_t even = vaddq_s16(curs, prvd);
3613 int16x8_t odd = vaddq_s16(curs, nxtd);
3615 // undo scaling and round, then store with even/odd phases interleaved
3617 o.val[0] = vqrshrun_n_s16(even, 4);
3618 o.val[1] = vqrshrun_n_s16(odd, 4);
3619 vst2_u8(out + i*2, o);
3622 // "previous" value for next iter
3623 t1 = 3*in_near[i+7] + in_far[i+7];
3627 t1 = 3*in_near[i] + in_far[i];
3628 out[i*2] = stbi__div16(3*t1 + t0 + 8);
3630 for (++i; i < w; ++i) {
3632 t1 = 3*in_near[i]+in_far[i];
3633 out[i*2-1] = stbi__div16(3*t0 + t1 + 8);
3634 out[i*2 ] = stbi__div16(3*t1 + t0 + 8);
3636 out[w*2-1] = stbi__div4(t1+2);
3644 static stbi_uc *stbi__resample_row_generic(stbi_uc *out, stbi_uc *in_near, stbi_uc *in_far, int w, int hs)
3646 // resample with nearest-neighbor
3648 STBI_NOTUSED(in_far);
3649 for (i=0; i < w; ++i)
3650 for (j=0; j < hs; ++j)
3651 out[i*hs+j] = in_near[i];
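/* Illustrative sketch, not part of stb_image: the generic fallback above is
   plain sample replication. The standalone copy below uses the hypothetical
   name demo_upsample_nearest and drops the unused in_far parameter.

```c
// Writes w*hs samples to out by repeating each input sample hs times.
static void demo_upsample_nearest(unsigned char *out, const unsigned char *in,
                                  int w, int hs)
{
   int i, j;
   for (i = 0; i < w; ++i)
      for (j = 0; j < hs; ++j)
         out[i*hs + j] = in[i];
}
```

   With the input { 1, 2, 3 } and hs = 3 the output is
   { 1, 1, 1, 2, 2, 2, 3, 3, 3 }.
*/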
3655 // this is a reduced-precision calculation of YCbCr-to-RGB introduced
3656 // to make sure the code produces the same results in both SIMD and scalar
3657 #define stbi__float2fixed(x) (((int) ((x) * 4096.0f + 0.5f)) << 8)
3658 static void stbi__YCbCr_to_RGB_row(stbi_uc *out, const stbi_uc *y, const stbi_uc *pcb, const stbi_uc *pcr, int count, int step)
3661 for (i=0; i < count; ++i) {
3662 int y_fixed = (y[i] << 20) + (1<<19); // rounding
3664 int cr = pcr[i] - 128;
3665 int cb = pcb[i] - 128;
3666 r = y_fixed + cr* stbi__float2fixed(1.40200f);
3667 g = y_fixed + (cr*-stbi__float2fixed(0.71414f)) + ((cb*-stbi__float2fixed(0.34414f)) & 0xffff0000);
3668 b = y_fixed + cb* stbi__float2fixed(1.77200f);
3672 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3673 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3674 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3675 out[0] = (stbi_uc)r;
3676 out[1] = (stbi_uc)g;
3677 out[2] = (stbi_uc)b;
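/* Illustrative sketch, not part of stb_image: a single-pixel version of the
   fixed-point YCbCr-to-RGB conversion above. Values are carried with 20
   fractional bits, so the sketch shifts right by 20 before clamping, which
   is what the "<< 20" scaling implies. The names DEMO_FLOAT2FIXED,
   demo_clamp and demo_ycbcr_to_rgb are hypothetical. Like the listing, it
   relies on right shifts of negative ints being arithmetic, which common
   compilers provide but the C standard does not guarantee.

```c
#define DEMO_FLOAT2FIXED(x)  (((int) ((x) * 4096.0f + 0.5f)) << 8)

static unsigned char demo_clamp(int v)
{
   if ((unsigned) v > 255) return (unsigned char)(v < 0 ? 0 : 255);
   return (unsigned char) v;
}

static void demo_ycbcr_to_rgb(unsigned char rgb[3], int y, int cb_in, int cr_in)
{
   int y_fixed = (y << 20) + (1 << 19);   // 20 fractional bits, plus 0.5 to round
   int cr = cr_in - 128;
   int cb = cb_in - 128;
   int r = y_fixed + cr *  DEMO_FLOAT2FIXED(1.40200f);
   int g = y_fixed + cr * -DEMO_FLOAT2FIXED(0.71414f)
                   + ((cb * -DEMO_FLOAT2FIXED(0.34414f)) & 0xffff0000);
   int b = y_fixed + cb *  DEMO_FLOAT2FIXED(1.77200f);
   r >>= 20;
   g >>= 20;
   b >>= 20;
   rgb[0] = demo_clamp(r);
   rgb[1] = demo_clamp(g);
   rgb[2] = demo_clamp(b);
}
```

   Neutral chroma (Cb = Cr = 128) passes luma straight through, so
   (128, 128, 128) maps to mid gray and (255, 128, 128) to white.
*/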
3683 #if defined(STBI_SSE2) || defined(STBI_NEON)
3684 static void stbi__YCbCr_to_RGB_simd(stbi_uc *out, stbi_uc const *y, stbi_uc const *pcb, stbi_uc const *pcr, int count, int step)
3689 // step == 3 is pretty ugly on the final interleave, and i'm not convinced
3690 // it's useful in practice (you wouldn't use it for textures, for example).
3691 // so just accelerate step == 4 case.
3693 // this is a fairly straightforward implementation and not super-optimized.
3694 __m128i signflip = _mm_set1_epi8(-0x80);
3695 __m128i cr_const0 = _mm_set1_epi16( (short) ( 1.40200f*4096.0f+0.5f));
3696 __m128i cr_const1 = _mm_set1_epi16( - (short) ( 0.71414f*4096.0f+0.5f));
3697 __m128i cb_const0 = _mm_set1_epi16( - (short) ( 0.34414f*4096.0f+0.5f));
3698 __m128i cb_const1 = _mm_set1_epi16( (short) ( 1.77200f*4096.0f+0.5f));
3699 __m128i y_bias = _mm_set1_epi8((char) (unsigned char) 128);
3700 __m128i xw = _mm_set1_epi16(255); // alpha channel
3702 for (; i+7 < count; i += 8) {
3704 __m128i y_bytes = _mm_loadl_epi64((__m128i *) (y+i));
3705 __m128i cr_bytes = _mm_loadl_epi64((__m128i *) (pcr+i));
3706 __m128i cb_bytes = _mm_loadl_epi64((__m128i *) (pcb+i));
3707 __m128i cr_biased = _mm_xor_si128(cr_bytes, signflip); // -128
3708 __m128i cb_biased = _mm_xor_si128(cb_bytes, signflip); // -128
3710 // unpack to short (and left-shift cr, cb by 8)
3711 __m128i yw = _mm_unpacklo_epi8(y_bias, y_bytes);
3712 __m128i crw = _mm_unpacklo_epi8(_mm_setzero_si128(), cr_biased);
3713 __m128i cbw = _mm_unpacklo_epi8(_mm_setzero_si128(), cb_biased);
3716 __m128i yws = _mm_srli_epi16(yw, 4);
3717 __m128i cr0 = _mm_mulhi_epi16(cr_const0, crw);
3718 __m128i cb0 = _mm_mulhi_epi16(cb_const0, cbw);
3719 __m128i cb1 = _mm_mulhi_epi16(cbw, cb_const1);
3720 __m128i cr1 = _mm_mulhi_epi16(crw, cr_const1);
3721 __m128i rws = _mm_add_epi16(cr0, yws);
3722 __m128i gwt = _mm_add_epi16(cb0, yws);
3723 __m128i bws = _mm_add_epi16(yws, cb1);
3724 __m128i gws = _mm_add_epi16(gwt, cr1);
3727 __m128i rw = _mm_srai_epi16(rws, 4);
3728 __m128i bw = _mm_srai_epi16(bws, 4);
3729 __m128i gw = _mm_srai_epi16(gws, 4);
3731 // back to byte, set up for transpose
3732 __m128i brb = _mm_packus_epi16(rw, bw);
3733 __m128i gxb = _mm_packus_epi16(gw, xw);
3735 // transpose to interleave channels
3736 __m128i t0 = _mm_unpacklo_epi8(brb, gxb);
3737 __m128i t1 = _mm_unpackhi_epi8(brb, gxb);
3738 __m128i o0 = _mm_unpacklo_epi16(t0, t1);
3739 __m128i o1 = _mm_unpackhi_epi16(t0, t1);
3742 _mm_storeu_si128((__m128i *) (out + 0), o0);
3743 _mm_storeu_si128((__m128i *) (out + 16), o1);
3750 // in this version, step=3 support would be easy to add. but is there demand?
3752 // this is a fairly straightforward implementation and not super-optimized.
3753 uint8x8_t signflip = vdup_n_u8(0x80);
3754 int16x8_t cr_const0 = vdupq_n_s16( (short) ( 1.40200f*4096.0f+0.5f));
3755 int16x8_t cr_const1 = vdupq_n_s16( - (short) ( 0.71414f*4096.0f+0.5f));
3756 int16x8_t cb_const0 = vdupq_n_s16( - (short) ( 0.34414f*4096.0f+0.5f));
3757 int16x8_t cb_const1 = vdupq_n_s16( (short) ( 1.77200f*4096.0f+0.5f));
3759 for (; i+7 < count; i += 8) {
3761 uint8x8_t y_bytes = vld1_u8(y + i);
3762 uint8x8_t cr_bytes = vld1_u8(pcr + i);
3763 uint8x8_t cb_bytes = vld1_u8(pcb + i);
3764 int8x8_t cr_biased = vreinterpret_s8_u8(vsub_u8(cr_bytes, signflip));
3765 int8x8_t cb_biased = vreinterpret_s8_u8(vsub_u8(cb_bytes, signflip));
3768 int16x8_t yws = vreinterpretq_s16_u16(vshll_n_u8(y_bytes, 4));
3769 int16x8_t crw = vshll_n_s8(cr_biased, 7);
3770 int16x8_t cbw = vshll_n_s8(cb_biased, 7);
3773 int16x8_t cr0 = vqdmulhq_s16(crw, cr_const0);
3774 int16x8_t cb0 = vqdmulhq_s16(cbw, cb_const0);
3775 int16x8_t cr1 = vqdmulhq_s16(crw, cr_const1);
3776 int16x8_t cb1 = vqdmulhq_s16(cbw, cb_const1);
3777 int16x8_t rws = vaddq_s16(yws, cr0);
3778 int16x8_t gws = vaddq_s16(vaddq_s16(yws, cb0), cr1);
3779 int16x8_t bws = vaddq_s16(yws, cb1);
3781 // undo scaling, round, convert to byte
3783 o.val[0] = vqrshrun_n_s16(rws, 4);
3784 o.val[1] = vqrshrun_n_s16(gws, 4);
3785 o.val[2] = vqrshrun_n_s16(bws, 4);
3786 o.val[3] = vdup_n_u8(255);
3788 // store, interleaving r/g/b/a
3795 for (; i < count; ++i) {
3796 int y_fixed = (y[i] << 20) + (1<<19); // rounding
3798 int cr = pcr[i] - 128;
3799 int cb = pcb[i] - 128;
3800 r = y_fixed + cr* stbi__float2fixed(1.40200f);
3801 g = y_fixed + cr*-stbi__float2fixed(0.71414f) + ((cb*-stbi__float2fixed(0.34414f)) & 0xffff0000);
3802 b = y_fixed + cb* stbi__float2fixed(1.77200f);
3806 if ((unsigned) r > 255) { if (r < 0) r = 0; else r = 255; }
3807 if ((unsigned) g > 255) { if (g < 0) g = 0; else g = 255; }
3808 if ((unsigned) b > 255) { if (b < 0) b = 0; else b = 255; }
3809 out[0] = (stbi_uc)r;
3810 out[1] = (stbi_uc)g;
3811 out[2] = (stbi_uc)b;
3818 // set up the kernels
3819 static void stbi__setup_jpeg(stbi__jpeg *j)
3821 j->idct_block_kernel = stbi__idct_block;
3822 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_row;
3823 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2;
3826 if (stbi__sse2_available()) {
3827 j->idct_block_kernel = stbi__idct_simd;
3828 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3829 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3834 j->idct_block_kernel = stbi__idct_simd;
3835 j->YCbCr_to_RGB_kernel = stbi__YCbCr_to_RGB_simd;
3836 j->resample_row_hv_2_kernel = stbi__resample_row_hv_2_simd;
3840 // clean up the temporary component buffers
3841 static void stbi__cleanup_jpeg(stbi__jpeg *j)
3843 stbi__free_jpeg_components(j, j->s->img_n, 0);
3848 resample_row_func resample;
3849 stbi_uc *line0,*line1;
3850 int hs,vs; // expansion factor in each axis
3851 int w_lores; // horizontal pixels pre-expansion
3852 int ystep; // how far through vertical expansion we are
3853 int ypos; // which pre-expansion row we're on
3856 // fast 0..255 * 0..255 => 0..255 rounded multiplication
3857 static stbi_uc stbi__blinn_8x8(stbi_uc x, stbi_uc y)
3859 unsigned int t = x*y + 128;
3860 return (stbi_uc) ((t + (t >>8)) >> 8);
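/* Illustrative sketch, not part of stb_image: the rounded 8-bit multiply
   above approximates x*y/255 without a division. The standalone copy below
   uses the hypothetical name demo_mul8 so a few values can be checked by
   hand.

```c
// 0..255 times 0..255 mapped back to 0..255 with rounding.
static unsigned char demo_mul8(unsigned char x, unsigned char y)
{
   unsigned int t = x*y + 128;
   return (unsigned char) ((t + (t >> 8)) >> 8);
}
```

   For example, demo_mul8(255, 255) is 255, demo_mul8(255, 128) is 128,
   demo_mul8(128, 128) is 64 and demo_mul8(0, 200) is 0.
*/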
3863 static stbi_uc *load_jpeg_image(stbi__jpeg *z, int *out_x, int *out_y, int *comp, int req_comp)
3865 int n, decode_n, is_rgb;
3866 z->s->img_n = 0; // make stbi__cleanup_jpeg safe
3868 // validate req_comp
3869 if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
3871 // load a jpeg image from whichever source, but leave in YCbCr format
3872 if (!stbi__decode_jpeg_image(z)) { stbi__cleanup_jpeg(z); return NULL; }
3874 // determine actual number of components to generate
3875 n = req_comp ? req_comp : z->s->img_n >= 3 ? 3 : 1;
3877 is_rgb = z->s->img_n == 3 && (z->rgb == 3 || (z->app14_color_transform == 0 && !z->jfif));
3879 if (z->s->img_n == 3 && n < 3 && !is_rgb)
3882 decode_n = z->s->img_n;
3884 // nothing to do if no components requested; check this now to avoid
3885 // accessing uninitialized coutput[0] later
3886 if (decode_n <= 0) { stbi__cleanup_jpeg(z); return NULL; }
3888 // resample and color-convert
3893 stbi_uc *coutput[4] = { NULL, NULL, NULL, NULL };
3895 stbi__resample res_comp[4];
3897 for (k=0; k < decode_n; ++k) {
3898 stbi__resample *r = &res_comp[k];
3900 // allocate line buffer big enough for upsampling off the edges
3901 // with upsample factor of 4
3902 z->img_comp[k].linebuf = (stbi_uc *) stbi__malloc(z->s->img_x + 3);
3903 if (!z->img_comp[k].linebuf) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3905 r->hs = z->img_h_max / z->img_comp[k].h;
3906 r->vs = z->img_v_max / z->img_comp[k].v;
3907 r->ystep = r->vs >> 1;
3908 r->w_lores = (z->s->img_x + r->hs-1) / r->hs;
3910 r->line0 = r->line1 = z->img_comp[k].data;
3912 if (r->hs == 1 && r->vs == 1) r->resample = resample_row_1;
3913 else if (r->hs == 1 && r->vs == 2) r->resample = stbi__resample_row_v_2;
3914 else if (r->hs == 2 && r->vs == 1) r->resample = stbi__resample_row_h_2;
3915 else if (r->hs == 2 && r->vs == 2) r->resample = z->resample_row_hv_2_kernel;
3916 else r->resample = stbi__resample_row_generic;
3919 // can't error after this, so this is safe
3920 output = (stbi_uc *) stbi__malloc_mad3(n, z->s->img_x, z->s->img_y, 1);
3921 if (!output) { stbi__cleanup_jpeg(z); return stbi__errpuc("outofmem", "Out of memory"); }
3923 // now go ahead and resample
3924 for (j=0; j < z->s->img_y; ++j) {
3925 stbi_uc *out = output + n * z->s->img_x * j;
3926 for (k=0; k < decode_n; ++k) {
3927 stbi__resample *r = &res_comp[k];
3928 int y_bot = r->ystep >= (r->vs >> 1);
3929 coutput[k] = r->resample(z->img_comp[k].linebuf,
3930 y_bot ? r->line1 : r->line0,
3931 y_bot ? r->line0 : r->line1,
3933 if (++r->ystep >= r->vs) {
3935 r->line0 = r->line1;
3936 if (++r->ypos < z->img_comp[k].y)
3937 r->line1 += z->img_comp[k].w2;
3941 stbi_uc *y = coutput[0];
3942 if (z->s->img_n == 3) {
3944 for (i=0; i < z->s->img_x; ++i) {
3946 out[1] = coutput[1][i];
3947 out[2] = coutput[2][i];
3952 z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
3954 } else if (z->s->img_n == 4) {
3955 if (z->app14_color_transform == 0) { // CMYK
3956 for (i=0; i < z->s->img_x; ++i) {
3957 stbi_uc m = coutput[3][i];
3958 out[0] = stbi__blinn_8x8(coutput[0][i], m);
3959 out[1] = stbi__blinn_8x8(coutput[1][i], m);
3960 out[2] = stbi__blinn_8x8(coutput[2][i], m);
3964 } else if (z->app14_color_transform == 2) { // YCCK
3965 z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
3966 for (i=0; i < z->s->img_x; ++i) {
3967 stbi_uc m = coutput[3][i];
3968 out[0] = stbi__blinn_8x8(255 - out[0], m);
3969 out[1] = stbi__blinn_8x8(255 - out[1], m);
3970 out[2] = stbi__blinn_8x8(255 - out[2], m);
3973 } else { // YCbCr + alpha? Ignore the fourth channel for now
3974 z->YCbCr_to_RGB_kernel(out, y, coutput[1], coutput[2], z->s->img_x, n);
3977 for (i=0; i < z->s->img_x; ++i) {
3978 out[0] = out[1] = out[2] = y[i];
3979 out[3] = 255; // not used if n==3
3985 for (i=0; i < z->s->img_x; ++i)
3986 *out++ = stbi__compute_y(coutput[0][i], coutput[1][i], coutput[2][i]);
3988 for (i=0; i < z->s->img_x; ++i, out += 2) {
3989 out[0] = stbi__compute_y(coutput[0][i], coutput[1][i], coutput[2][i]);
3993 } else if (z->s->img_n == 4 && z->app14_color_transform == 0) {
3994 for (i=0; i < z->s->img_x; ++i) {
3995 stbi_uc m = coutput[3][i];
3996 stbi_uc r = stbi__blinn_8x8(coutput[0][i], m);
3997 stbi_uc g = stbi__blinn_8x8(coutput[1][i], m);
3998 stbi_uc b = stbi__blinn_8x8(coutput[2][i], m);
3999 out[0] = stbi__compute_y(r, g, b);
4003 } else if (z->s->img_n == 4 && z->app14_color_transform == 2) {
4004 for (i=0; i < z->s->img_x; ++i) {
4005 out[0] = stbi__blinn_8x8(255 - coutput[0][i], coutput[3][i]);
4010 stbi_uc *y = coutput[0];
4012 for (i=0; i < z->s->img_x; ++i) out[i] = y[i];
4014 for (i=0; i < z->s->img_x; ++i) { *out++ = y[i]; *out++ = 255; }
4018 stbi__cleanup_jpeg(z);
4019 *out_x = z->s->img_x;
4020 *out_y = z->s->img_y;
4021 if (comp) *comp = z->s->img_n >= 3 ? 3 : 1; // report original components, not output
4026 static void *stbi__jpeg_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
4028 unsigned char* result;
4029 stbi__jpeg* j = (stbi__jpeg*) stbi__malloc(sizeof(stbi__jpeg));
4030 if (!j) return stbi__errpuc("outofmem", "Out of memory");
4031 memset(j, 0, sizeof(stbi__jpeg));
4034 stbi__setup_jpeg(j);
4035 result = load_jpeg_image(j, x,y,comp,req_comp);
4040 static int stbi__jpeg_test(stbi__context *s)
4043 stbi__jpeg* j = (stbi__jpeg*)stbi__malloc(sizeof(stbi__jpeg));
4044 if (!j) return stbi__err("outofmem", "Out of memory");
4045 memset(j, 0, sizeof(stbi__jpeg));
4047 stbi__setup_jpeg(j);
4048 r = stbi__decode_jpeg_header(j, STBI__SCAN_type);
4054 static int stbi__jpeg_info_raw(stbi__jpeg *j, int *x, int *y, int *comp)
4056 if (!stbi__decode_jpeg_header(j, STBI__SCAN_header)) {
4057 stbi__rewind( j->s );
4060 if (x) *x = j->s->img_x;
4061 if (y) *y = j->s->img_y;
4062 if (comp) *comp = j->s->img_n >= 3 ? 3 : 1;
4066 static int stbi__jpeg_info(stbi__context *s, int *x, int *y, int *comp)
4069 stbi__jpeg* j = (stbi__jpeg*) (stbi__malloc(sizeof(stbi__jpeg)));
4070 if (!j) return stbi__err("outofmem", "Out of memory");
4071 memset(j, 0, sizeof(stbi__jpeg));
4073 result = stbi__jpeg_info_raw(j, x, y, comp);
4079 // public domain zlib decode v0.2 Sean Barrett 2006-11-18
4080 // simple implementation
4081 // - all input must be provided in an upfront buffer
4082 // - all output is written to a single output buffer (can malloc/realloc)
4086 #ifndef STBI_NO_ZLIB
4088 // fast-way is faster to check than jpeg huffman, but slow way is slower
4089 #define STBI__ZFAST_BITS 9 // accelerate all cases in default tables
4090 #define STBI__ZFAST_MASK ((1 << STBI__ZFAST_BITS) - 1)
4091 #define STBI__ZNSYMS 288 // number of symbols in literal/length alphabet
4093 // zlib-style huffman encoding
4094 // (JPEG packs bits from the left, zlib from the right, so the code can't be shared)
4097 stbi__uint16 fast[1 << STBI__ZFAST_BITS];
4098 stbi__uint16 firstcode[16];
4100 stbi__uint16 firstsymbol[16];
4101 stbi_uc size[STBI__ZNSYMS];
4102 stbi__uint16 value[STBI__ZNSYMS];
stbi_inline static int stbi__bitreverse16(int n)
{
  n = ((n & 0xAAAA) >>  1) | ((n & 0x5555) << 1);
  n = ((n & 0xCCCC) >>  2) | ((n & 0x3333) << 2);
  n = ((n & 0xF0F0) >>  4) | ((n & 0x0F0F) << 4);
  n = ((n & 0xFF00) >>  8) | ((n & 0x00FF) << 8);
  return n;
}

stbi_inline static int stbi__bit_reverse(int v, int bits)
{
   STBI_ASSERT(bits <= 16);
   // to bit reverse n bits, reverse 16 and shift
   // e.g. 11 bits, bit reverse and shift away 5
   return stbi__bitreverse16(v) >> (16-bits);
}
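
// Illustration only, not part of the library: a standalone sketch of the
// "reverse 16 bits, then shift away the unused ones" trick described above.
// The names reverse16 and reverse_bits are hypothetical.

```c
#include <assert.h>

/* Reverse all 16 bits by swapping progressively larger groups. */
static unsigned reverse16(unsigned n)
{
   n = ((n & 0xAAAAu) >> 1) | ((n & 0x5555u) << 1);   /* swap adjacent bits    */
   n = ((n & 0xCCCCu) >> 2) | ((n & 0x3333u) << 2);   /* swap bit pairs        */
   n = ((n & 0xF0F0u) >> 4) | ((n & 0x0F0Fu) << 4);   /* swap nibbles          */
   n = ((n & 0xFF00u) >> 8) | ((n & 0x00FFu) << 8);   /* swap the two bytes    */
   return n;
}

/* Reverse only the low `bits` bits of v, for 1 <= bits <= 16. */
static unsigned reverse_bits(unsigned v, int bits)
{
   assert(bits >= 1 && bits <= 16);
   return reverse16(v) >> (16 - bits);
}
```

// For an 11-bit value, the reversed word is shifted right by 5, so bit 0 of
// the input ends up in bit 10 of the result.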
static int stbi__zbuild_huffman(stbi__zhuffman *z, const stbi_uc *sizelist, int num)
{
   int i,k=0;
   int code, next_code[16], sizes[17];

   // DEFLATE spec for generating codes
   memset(sizes, 0, sizeof(sizes));
   memset(z->fast, 0, sizeof(z->fast));
   for (i=0; i < num; ++i)
      ++sizes[sizelist[i]];
   sizes[0] = 0;
   for (i=1; i < 16; ++i)
      if (sizes[i] > (1 << i))
         return stbi__err("bad sizes", "Corrupt PNG");
   code = 0;
   for (i=1; i < 16; ++i) {
      next_code[i] = code;
      z->firstcode[i] = (stbi__uint16) code;
      z->firstsymbol[i] = (stbi__uint16) k;
      code = (code + sizes[i]);
      if (sizes[i])
         if (code-1 >= (1 << i)) return stbi__err("bad codelengths","Corrupt PNG");
      z->maxcode[i] = code << (16-i); // preshift for inner loop
      code <<= 1;
      k += sizes[i];
   }
   z->maxcode[16] = 0x10000; // sentinel
   for (i=0; i < num; ++i) {
      int s = sizelist[i];
      if (s) {
         int c = next_code[s] - z->firstcode[s] + z->firstsymbol[s];
         stbi__uint16 fastv = (stbi__uint16) ((s << 9) | i);
         z->size [c] = (stbi_uc     ) s;
         z->value[c] = (stbi__uint16) i;
         if (s <= STBI__ZFAST_BITS) {
            // every fast-table slot whose low s bits match the reversed code
            int j = stbi__bit_reverse(next_code[s],s);
            while (j < (1 << STBI__ZFAST_BITS)) {
               z->fast[j] = fastv;
               j += (1 << s);
            }
         }
         ++next_code[s];
      }
   }
   return 1;
}
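
// Illustration only, not part of the library: a minimal sketch of the
// canonical code assignment that "DEFLATE spec for generating codes" refers
// to, assuming code lengths below 16. canonical_code is a hypothetical
// helper; it recomputes the table on every call, which is fine for a demo
// but not how a decoder would do it.

```c
#include <assert.h>

/* Return the canonical Huffman code for symbol `sym`, or -1 if the
   symbol is unused (length 0). */
static int canonical_code(const unsigned char *lengths, int num, int sym)
{
   int count[16] = {0};
   int next_code[16] = {0};
   int code = 0, bits, i;

   assert(sym >= 0 && sym < num);
   for (i = 0; i < num; ++i) {
      assert(lengths[i] < 16);
      ++count[lengths[i]];
   }
   count[0] = 0;                        /* length 0 means "not present" */

   /* first code of each length: add the codes used by the previous
      length, then shift left one bit */
   for (bits = 1; bits < 16; ++bits) {
      code = (code + count[bits - 1]) << 1;
      next_code[bits] = code;
   }

   if (lengths[sym] == 0) return -1;
   /* symbols of equal length get consecutive codes in symbol order */
   code = next_code[lengths[sym]];
   for (i = 0; i < sym; ++i)
      if (lengths[i] == lengths[sym]) ++code;
   return code;
}
```

// With lengths (3,3,3,3,3,2,4,4) the symbols receive the codes
// 010,011,100,101,110,00,1110,1111, matching the worked example in RFC 1951.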
// zlib-from-memory implementation for PNG reading
//    because PNG allows splitting the zlib stream arbitrarily,
//    and it's annoying structurally to have PNG call ZLIB call PNG,
//    we require PNG read all the IDATs and combine them into a single
//    memory buffer

typedef struct
{
   stbi_uc *zbuffer, *zbuffer_end;
   int num_bits;
   stbi__uint32 code_buffer;

   char *zout;
   char *zout_start;
   char *zout_end;
   int   z_expandable;

   stbi__zhuffman z_length, z_distance;
} stbi__zbuf;
stbi_inline static int stbi__zeof(stbi__zbuf *z)
{
   return (z->zbuffer >= z->zbuffer_end);
}

stbi_inline static stbi_uc stbi__zget8(stbi__zbuf *z)
{
   return stbi__zeof(z) ? 0 : *z->zbuffer++;
}

static void stbi__fill_bits(stbi__zbuf *z)
{
   do {
      if (z->code_buffer >= (1U << z->num_bits)) {
        z->zbuffer = z->zbuffer_end;  /* treat this as EOF so we fail. */
        return;
      }
      z->code_buffer |= (unsigned int) stbi__zget8(z) << z->num_bits;
      z->num_bits += 8;
   } while (z->num_bits <= 24);
}

stbi_inline static unsigned int stbi__zreceive(stbi__zbuf *z, int n)
{
   unsigned int k;
   if (z->num_bits < n) stbi__fill_bits(z);
   k = z->code_buffer & ((1 << n) - 1);
   z->code_buffer >>= n;
   z->num_bits -= n;
   return k;
}
4221 static int stbi__zhuffman_decode_slowpath(stbi__zbuf *a, stbi__zhuffman *z)
4224 // not resolved by fast table, so compute it the slow way
4225 // use jpeg approach, which requires MSbits at top
4226 k = stbi__bit_reverse(a->code_buffer, 16);
4227 for (s=STBI__ZFAST_BITS+1; ; ++s)
4228 if (k < z->maxcode[s])
4230 if (s >= 16) return -1; // invalid code!
4231 // code size is s, so:
4232 b = (k >> (16-s)) - z->firstcode[s] + z->firstsymbol[s];
4233 if (b >= STBI__ZNSYMS) return -1; // some data was corrupt somewhere!
4234 if (z->size[b] != s) return -1; // was originally an assert, but report failure instead.
4235 a->code_buffer >>= s;
4240 stbi_inline static int stbi__zhuffman_decode(stbi__zbuf *a, stbi__zhuffman *z)
4243 if (a->num_bits < 16) {
4244 if (stbi__zeof(a)) {
4245 return -1; /* report error for unexpected end of data. */
4249 b = z->fast[a->code_buffer & STBI__ZFAST_MASK];
4252 a->code_buffer >>= s;
4256 return stbi__zhuffman_decode_slowpath(a, z);
static int stbi__zexpand(stbi__zbuf *z, char *zout, int n)  // need to make room for n bytes
{
   char *q;
   unsigned int cur, limit, old_limit;
   z->zout = zout;
   if (!z->z_expandable) return stbi__err("output buffer limit","Corrupt PNG");
   cur   = (unsigned int) (z->zout - z->zout_start);
   limit = old_limit = (unsigned) (z->zout_end - z->zout_start);
   if (UINT_MAX - cur < (unsigned) n) return stbi__err("outofmem", "Out of memory");
   while (cur + n > limit) {
      if(limit > UINT_MAX / 2) return stbi__err("outofmem", "Out of memory");
      limit *= 2;
   }
   q = (char *) STBI_REALLOC_SIZED(z->zout_start, old_limit, limit);
   STBI_NOTUSED(old_limit);
   if (q == NULL) return stbi__err("outofmem", "Out of memory");
   z->zout_start = q;
   z->zout       = q + cur;
   z->zout_end   = q + limit;
   return 1;
}
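
// Illustration only, not part of the library: the overflow-checked doubling
// used when the output buffer must grow, isolated as a pure function.
// grow_capacity is a hypothetical name, and returning 0 on failure is an
// assumption of this sketch rather than the library's error convention.

```c
#include <assert.h>
#include <limits.h>

/* Return the capacity needed to hold cur + n bytes, doubling from
   `limit`; return 0 if the request cannot be represented. */
static unsigned grow_capacity(unsigned cur, unsigned n, unsigned limit)
{
   assert(limit > 0);                      /* doubling 0 would never terminate */
   if (UINT_MAX - cur < n) return 0;       /* cur + n would wrap around        */
   while (cur + n > limit) {
      if (limit > UINT_MAX / 2) return 0;  /* limit * 2 would wrap around      */
      limit *= 2;
   }
   return limit;
}
```

// Both checks run before the arithmetic they guard, so neither cur + n nor
// limit * 2 is ever evaluated in a state where it could wrap.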
4281 static const int stbi__zlength_base[31] = {
4282 3,4,5,6,7,8,9,10,11,13,
4283 15,17,19,23,27,31,35,43,51,59,
4284 67,83,99,115,131,163,195,227,258,0,0 };
4286 static const int stbi__zlength_extra[31]=
4287 { 0,0,0,0,0,0,0,0,1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,0,0,0 };
4289 static const int stbi__zdist_base[32] = { 1,2,3,4,5,7,9,13,17,25,33,49,65,97,129,193,
4290 257,385,513,769,1025,1537,2049,3073,4097,6145,8193,12289,16385,24577,0,0};
4292 static const int stbi__zdist_extra[32] =
4293 { 0,0,0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13};
4295 static int stbi__parse_huffman_block(stbi__zbuf *a)
4297 char *zout = a->zout;
4299 int z = stbi__zhuffman_decode(a, &a->z_length);
4301 if (z < 0) return stbi__err("bad huffman code","Corrupt PNG"); // error in huffman codes
4302 if (zout >= a->zout_end) {
4303 if (!stbi__zexpand(a, zout, 1)) return 0;
4314 if (z >= 286) return stbi__err("bad huffman code","Corrupt PNG"); // per DEFLATE, length codes 286 and 287 must not appear in compressed data
4316 len = stbi__zlength_base[z];
4317 if (stbi__zlength_extra[z]) len += stbi__zreceive(a, stbi__zlength_extra[z]);
4318 z = stbi__zhuffman_decode(a, &a->z_distance);
4319 if (z < 0 || z >= 30) return stbi__err("bad huffman code","Corrupt PNG"); // per DEFLATE, distance codes 30 and 31 must not appear in compressed data
4320 dist = stbi__zdist_base[z];
4321 if (stbi__zdist_extra[z]) dist += stbi__zreceive(a, stbi__zdist_extra[z]);
4322 if (zout - a->zout_start < dist) return stbi__err("bad dist","Corrupt PNG");
4323 if (zout + len > a->zout_end) {
4324 if (!stbi__zexpand(a, zout, len)) return 0;
4327 p = (stbi_uc *) (zout - dist);
4328 if (dist == 1) { // run of one byte; common in images.
4330 if (len) { do *zout++ = v; while (--len); }
4332 if (len) { do *zout++ = *p++; while (--len); }
4338 static int stbi__compute_huffman_codes(stbi__zbuf *a)
4340 static const stbi_uc length_dezigzag[19] = { 16,17,18,0,8,7,9,6,10,5,11,4,12,3,13,2,14,1,15 };
4341 stbi__zhuffman z_codelength;
4342 stbi_uc lencodes[286+32+137];//padding for maximum single op
4343 stbi_uc codelength_sizes[19];
4346 int hlit = stbi__zreceive(a,5) + 257;
4347 int hdist = stbi__zreceive(a,5) + 1;
4348 int hclen = stbi__zreceive(a,4) + 4;
4349 int ntot = hlit + hdist;
4351 memset(codelength_sizes, 0, sizeof(codelength_sizes));
4352 for (i=0; i < hclen; ++i) {
4353 int s = stbi__zreceive(a,3);
4354 codelength_sizes[length_dezigzag[i]] = (stbi_uc) s;
4356 if (!stbi__zbuild_huffman(&z_codelength, codelength_sizes, 19)) return 0;
4360 int c = stbi__zhuffman_decode(a, &z_codelength);
4361 if (c < 0 || c >= 19) return stbi__err("bad codelengths", "Corrupt PNG");
4363 lencodes[n++] = (stbi_uc) c;
4367 c = stbi__zreceive(a,2)+3;
4368 if (n == 0) return stbi__err("bad codelengths", "Corrupt PNG");
4369 fill = lencodes[n-1];
4370 } else if (c == 17) {
4371 c = stbi__zreceive(a,3)+3;
4372 } else if (c == 18) {
4373 c = stbi__zreceive(a,7)+11;
4375 return stbi__err("bad codelengths", "Corrupt PNG");
4377 if (ntot - n < c) return stbi__err("bad codelengths", "Corrupt PNG");
4378 memset(lencodes+n, fill, c);
4382 if (n != ntot) return stbi__err("bad codelengths","Corrupt PNG");
4383 if (!stbi__zbuild_huffman(&a->z_length, lencodes, hlit)) return 0;
4384 if (!stbi__zbuild_huffman(&a->z_distance, lencodes+hlit, hdist)) return 0;
4388 static int stbi__parse_uncompressed_block(stbi__zbuf *a)
4392 if (a->num_bits & 7)
4393 stbi__zreceive(a, a->num_bits & 7); // discard
4394 // drain the bit-packed data into header
4396 while (a->num_bits > 0) {
4397 header[k++] = (stbi_uc) (a->code_buffer & 255); // suppress MSVC run-time check
4398 a->code_buffer >>= 8;
4401 if (a->num_bits < 0) return stbi__err("zlib corrupt","Corrupt PNG");
4402 // now fill header the normal way
4404 header[k++] = stbi__zget8(a);
4405 len = header[1] * 256 + header[0];
4406 nlen = header[3] * 256 + header[2];
4407 if (nlen != (len ^ 0xffff)) return stbi__err("zlib corrupt","Corrupt PNG");
4408 if (a->zbuffer + len > a->zbuffer_end) return stbi__err("read past buffer","Corrupt PNG");
4409 if (a->zout + len > a->zout_end)
4410 if (!stbi__zexpand(a, a->zout, len)) return 0;
4411 memcpy(a->zout, a->zbuffer, len);
static int stbi__parse_zlib_header(stbi__zbuf *a)
{
   int cmf   = stbi__zget8(a);
   int cm    = cmf & 15;
   /* int cinfo = cmf >> 4; */
   int flg   = stbi__zget8(a);
   if (stbi__zeof(a)) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
   if ((cmf*256+flg) % 31 != 0) return stbi__err("bad zlib header","Corrupt PNG"); // zlib spec
   if (flg & 32) return stbi__err("no preset dict","Corrupt PNG"); // preset dictionary not allowed in png
   if (cm != 8) return stbi__err("bad compression","Corrupt PNG"); // DEFLATE required for png
   // window = 1 << (8 + cinfo)... but who cares, we fully buffer output
   return 1;
}
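
// Illustration only, not part of the library: the same three header checks
// applied to a pair of byte values, assuming the caller has already read
// CMF and FLG. zlib_header_ok is a hypothetical helper; unlike the function
// above it does not look at whether any data follows the header.

```c
#include <assert.h>

/* Return 1 if the two-byte zlib header is acceptable for PNG, else 0. */
static int zlib_header_ok(unsigned cmf, unsigned flg)
{
   assert(cmf <= 255 && flg <= 255);
   if ((cmf * 256u + flg) % 31u != 0) return 0;  /* FCHECK: multiple of 31   */
   if (flg & 32u) return 0;                      /* FDICT: preset dictionary */
   if ((cmf & 15u) != 8u) return 0;              /* CM must be 8 (DEFLATE)   */
   return 1;
}
```

// The common header bytes 0x78 0x9C pass all three checks; flipping the low
// bit of FLG breaks the multiple-of-31 rule.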
static const stbi_uc stbi__zdefault_length[STBI__ZNSYMS] =
{
   8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
   8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
   8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
   8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,
   8,8,8,8,8,8,8,8,8,8,8,8,8,8,8,8, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
   9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
   9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
   9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9, 9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,9,
   7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7, 7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8
};
static const stbi_uc stbi__zdefault_distance[32] =
{
   5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5
};
/*
Init algorithm (kept for reference only; the tables above are const and
already hold these values):
{
   int i;   // use <= to match clearly with spec
   for (i=0; i <= 143; ++i)     stbi__zdefault_length[i]   = 8;
   for (   ; i <= 255; ++i)     stbi__zdefault_length[i]   = 9;
   for (   ; i <= 279; ++i)     stbi__zdefault_length[i]   = 7;
   for (   ; i <= 287; ++i)     stbi__zdefault_length[i]   = 8;

   for (i=0; i <=  31; ++i)     stbi__zdefault_distance[i] = 5;
}
*/
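
// Illustration only, not part of the library: the fixed literal/length code
// lengths expressed as a function, plus a check that they form a complete
// prefix code. Both function names are hypothetical.

```c
#include <assert.h>

/* Code length of symbol `sym` in DEFLATE's fixed literal/length code. */
static int fixed_literal_length(int sym)
{
   assert(sym >= 0 && sym <= 287);
   if (sym <= 143) return 8;
   if (sym <= 255) return 9;
   if (sym <= 279) return 7;
   return 8;
}

/* Kraft sum scaled by 2^15: a complete prefix code totals exactly 32768. */
static long fixed_code_space(void)
{
   long total = 0;
   int sym;
   for (sym = 0; sym <= 287; ++sym)
      total += 1L << (15 - fixed_literal_length(sym));
   return total;
}
```

// 152 symbols of length 8, 112 of length 9 and 24 of length 7 use the code
// space exactly, which is why the size checks in stbi__zbuild_huffman accept
// this table.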
4460 static int stbi__parse_zlib(stbi__zbuf *a, int parse_header)
4464 if (!stbi__parse_zlib_header(a)) return 0;
4468 final = stbi__zreceive(a,1);
4469 type = stbi__zreceive(a,2);
4471 if (!stbi__parse_uncompressed_block(a)) return 0;
4472 } else if (type == 3) {
4476 // use fixed code lengths
4477 if (!stbi__zbuild_huffman(&a->z_length , stbi__zdefault_length , STBI__ZNSYMS)) return 0;
4478 if (!stbi__zbuild_huffman(&a->z_distance, stbi__zdefault_distance, 32)) return 0;
4480 if (!stbi__compute_huffman_codes(a)) return 0;
4482 if (!stbi__parse_huffman_block(a)) return 0;
4488 static int stbi__do_zlib(stbi__zbuf *a, char *obuf, int olen, int exp, int parse_header)
4490 a->zout_start = obuf;
4492 a->zout_end = obuf + olen;
4493 a->z_expandable = exp;
4495 return stbi__parse_zlib(a, parse_header);
4498 STBIDEF char *stbi_zlib_decode_malloc_guesssize(const char *buffer, int len, int initial_size, int *outlen)
4501 char *p = (char *) stbi__malloc(initial_size);
4502 if (p == NULL) return NULL;
4503 a.zbuffer = (stbi_uc *) buffer;
4504 a.zbuffer_end = (stbi_uc *) buffer + len;
4505 if (stbi__do_zlib(&a, p, initial_size, 1, 1)) {
4506 if (outlen) *outlen = (int) (a.zout - a.zout_start);
4507 return a.zout_start;
4509 STBI_FREE(a.zout_start);
4514 STBIDEF char *stbi_zlib_decode_malloc(char const *buffer, int len, int *outlen)
4516 return stbi_zlib_decode_malloc_guesssize(buffer, len, 16384, outlen);
4519 STBIDEF char *stbi_zlib_decode_malloc_guesssize_headerflag(const char *buffer, int len, int initial_size, int *outlen, int parse_header)
4522 char *p = (char *) stbi__malloc(initial_size);
4523 if (p == NULL) return NULL;
4524 a.zbuffer = (stbi_uc *) buffer;
4525 a.zbuffer_end = (stbi_uc *) buffer + len;
4526 if (stbi__do_zlib(&a, p, initial_size, 1, parse_header)) {
4527 if (outlen) *outlen = (int) (a.zout - a.zout_start);
4528 return a.zout_start;
4530 STBI_FREE(a.zout_start);
4535 STBIDEF int stbi_zlib_decode_buffer(char *obuffer, int olen, char const *ibuffer, int ilen)
4538 a.zbuffer = (stbi_uc *) ibuffer;
4539 a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
4540 if (stbi__do_zlib(&a, obuffer, olen, 0, 1))
4541 return (int) (a.zout - a.zout_start);
4546 STBIDEF char *stbi_zlib_decode_noheader_malloc(char const *buffer, int len, int *outlen)
4549 char *p = (char *) stbi__malloc(16384);
4550 if (p == NULL) return NULL;
4551 a.zbuffer = (stbi_uc *) buffer;
4552 a.zbuffer_end = (stbi_uc *) buffer+len;
4553 if (stbi__do_zlib(&a, p, 16384, 1, 0)) {
4554 if (outlen) *outlen = (int) (a.zout - a.zout_start);
4555 return a.zout_start;
4557 STBI_FREE(a.zout_start);
4562 STBIDEF int stbi_zlib_decode_noheader_buffer(char *obuffer, int olen, const char *ibuffer, int ilen)
4565 a.zbuffer = (stbi_uc *) ibuffer;
4566 a.zbuffer_end = (stbi_uc *) ibuffer + ilen;
4567 if (stbi__do_zlib(&a, obuffer, olen, 0, 0))
4568 return (int) (a.zout - a.zout_start);
4574 // public domain "baseline" PNG decoder v0.10 Sean Barrett 2006-11-18
4575 // simple implementation
4576 // - only 8-bit samples
4577 // - no CRC checking
4578 // - allocates lots of intermediate memory
4579 // - avoids problem of streaming data between subsystems
4580 // - avoids explicit window management
4582 // - uses stb_zlib, a PD zlib implementation with fast huffman decoding
typedef struct
{
   stbi__uint32 length;
   stbi__uint32 type;
} stbi__pngchunk;

static stbi__pngchunk stbi__get_chunk_header(stbi__context *s)
{
   stbi__pngchunk c;
   c.length = stbi__get32be(s);
   c.type   = stbi__get32be(s);
   return c;
}

static int stbi__check_png_header(stbi__context *s)
{
   static const stbi_uc png_sig[8] = { 137,80,78,71,13,10,26,10 };
   int i;
   for (i=0; i < 8; ++i)
      if (stbi__get8(s) != png_sig[i]) return stbi__err("bad png sig","Not a PNG");
   return 1;
}
4611 stbi_uc *idata, *expanded, *out;
4622 // synthetic filters used for first scanline to avoid needing a dummy row of 0s
4627 static stbi_uc first_row_filter[5] =
static int stbi__paeth(int a, int b, int c)
{
   int p = a + b - c;
   int pa = abs(p-a);
   int pb = abs(p-b);
   int pc = abs(p-c);
   if (pa <= pb && pa <= pc) return a;
   if (pb <= pc) return b;
   return c;
}
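
// Illustration only, not part of the library: the Paeth predictor and the
// matching unfilter step for a single byte, with a = left, b = above and
// c = upper-left neighbour. paeth_predict and paeth_unfilter are
// hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Pick whichever neighbour is closest to the estimate a + b - c;
   ties are resolved in the order a, b, c. */
static int paeth_predict(int a, int b, int c)
{
   int p, pa, pb, pc;
   assert(a >= 0 && a <= 255 && b >= 0 && b <= 255 && c >= 0 && c <= 255);
   p  = a + b - c;
   pa = abs(p - a);
   pb = abs(p - b);
   pc = abs(p - c);
   if (pa <= pb && pa <= pc) return a;
   if (pb <= pc) return b;
   return c;
}

/* Undo the Paeth filter for one byte: add the prediction modulo 256. */
static unsigned char paeth_unfilter(unsigned char raw, int left, int up, int upleft)
{
   return (unsigned char) ((raw + paeth_predict(left, up, upleft)) & 255);
}
```

// On the first scanline there is no row above, so b and c are 0 and the
// predictor always returns the left neighbour.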
4647 static const stbi_uc stbi__depth_scale_table[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };
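
// Illustration only, not part of the library: how the scale table maps
// low-bit-depth grayscale samples onto 0..255. The multiplier for depth d is
// 255 / (2^d - 1), so the largest sample always becomes 255. depth_scale and
// scale_gray are hypothetical names.

```c
#include <assert.h>

/* Indexed by bit depth; only depths 1, 2, 4 and 8 are meaningful. */
static const unsigned char depth_scale[9] = { 0, 0xff, 0x55, 0, 0x11, 0,0,0, 0x01 };

/* Expand a `depth`-bit grayscale sample to the full 8-bit range. */
static unsigned char scale_gray(unsigned sample, int depth)
{
   assert(depth == 1 || depth == 2 || depth == 4 || depth == 8);
   assert(sample < (1u << depth));
   return (unsigned char) (sample * depth_scale[depth]);
}
```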
4649 // create the png data from post-deflated data
4650 static int stbi__create_png_image_raw(stbi__png *a, stbi_uc *raw, stbi__uint32 raw_len, int out_n, stbi__uint32 x, stbi__uint32 y, int depth, int color)
4652 int bytes = (depth == 16? 2 : 1);
4653 stbi__context *s = a->s;
4654 stbi__uint32 i,j,stride = x*out_n*bytes;
4655 stbi__uint32 img_len, img_width_bytes;
4657 int img_n = s->img_n; // copy it into a local for later
4659 int output_bytes = out_n*bytes;
4660 int filter_bytes = img_n*bytes;
4663 STBI_ASSERT(out_n == s->img_n || out_n == s->img_n+1);
4664 a->out = (stbi_uc *) stbi__malloc_mad3(x, y, output_bytes, 0); // extra bytes to write off the end into
4665 if (!a->out) return stbi__err("outofmem", "Out of memory");
4667 if (!stbi__mad3sizes_valid(img_n, x, depth, 7)) return stbi__err("too large", "Corrupt PNG");
4668 img_width_bytes = (((img_n * x * depth) + 7) >> 3);
4669 img_len = (img_width_bytes + 1) * y;
4671 // we used to check for exact match between raw_len and img_len on non-interlaced PNGs,
4672 // but issue #276 reported a PNG in the wild that had extra data at the end (all zeros),
4673 // so just check for raw_len < img_len always.
4674 if (raw_len < img_len) return stbi__err("not enough pixels","Corrupt PNG");
4676 for (j=0; j < y; ++j) {
4677 stbi_uc *cur = a->out + stride*j;
4679 int filter = *raw++;
4682 return stbi__err("invalid filter","Corrupt PNG");
4685 if (img_width_bytes > x) return stbi__err("invalid width","Corrupt PNG");
4686 cur += x*out_n - img_width_bytes; // store output to the rightmost img_len bytes, so we can decode in place
4688 width = img_width_bytes;
4690 prior = cur - stride; // bugfix: need to compute this after 'cur +=' computation above
4692 // if first row, use special filter that doesn't sample previous row
4693 if (j == 0) filter = first_row_filter[filter];
4695 // handle first byte explicitly
4696 for (k=0; k < filter_bytes; ++k) {
4698 case STBI__F_none : cur[k] = raw[k]; break;
4699 case STBI__F_sub : cur[k] = raw[k]; break;
4700 case STBI__F_up : cur[k] = STBI__BYTECAST(raw[k] + prior[k]); break;
4701 case STBI__F_avg : cur[k] = STBI__BYTECAST(raw[k] + (prior[k]>>1)); break;
4702 case STBI__F_paeth : cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(0,prior[k],0)); break;
4703 case STBI__F_avg_first : cur[k] = raw[k]; break;
4704 case STBI__F_paeth_first: cur[k] = raw[k]; break;
4710 cur[img_n] = 255; // first pixel
4714 } else if (depth == 16) {
4715 if (img_n != out_n) {
4716 cur[filter_bytes] = 255; // first pixel top byte
4717 cur[filter_bytes+1] = 255; // first pixel bottom byte
4719 raw += filter_bytes;
4720 cur += output_bytes;
4721 prior += output_bytes;
4728 // this is a little gross, so that we don't switch per-pixel or per-component
4729 if (depth < 8 || img_n == out_n) {
4730 int nk = (width - 1)*filter_bytes;
4731 #define STBI__CASE(f) \
4733 for (k=0; k < nk; ++k)
4735 // "none" filter turns into a memcpy here; make that explicit.
4736 case STBI__F_none: memcpy(cur, raw, nk); break;
4737 STBI__CASE(STBI__F_sub) { cur[k] = STBI__BYTECAST(raw[k] + cur[k-filter_bytes]); } break;
4738 STBI__CASE(STBI__F_up) { cur[k] = STBI__BYTECAST(raw[k] + prior[k]); } break;
4739 STBI__CASE(STBI__F_avg) { cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k-filter_bytes])>>1)); } break;
4740 STBI__CASE(STBI__F_paeth) { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],prior[k],prior[k-filter_bytes])); } break;
4741 STBI__CASE(STBI__F_avg_first) { cur[k] = STBI__BYTECAST(raw[k] + (cur[k-filter_bytes] >> 1)); } break;
4742 STBI__CASE(STBI__F_paeth_first) { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k-filter_bytes],0,0)); } break;
4747 STBI_ASSERT(img_n+1 == out_n);
4748 #define STBI__CASE(f) \
4750 for (i=x-1; i >= 1; --i, cur[filter_bytes]=255,raw+=filter_bytes,cur+=output_bytes,prior+=output_bytes) \
4751 for (k=0; k < filter_bytes; ++k)
4753 STBI__CASE(STBI__F_none) { cur[k] = raw[k]; } break;
4754 STBI__CASE(STBI__F_sub) { cur[k] = STBI__BYTECAST(raw[k] + cur[k- output_bytes]); } break;
4755 STBI__CASE(STBI__F_up) { cur[k] = STBI__BYTECAST(raw[k] + prior[k]); } break;
4756 STBI__CASE(STBI__F_avg) { cur[k] = STBI__BYTECAST(raw[k] + ((prior[k] + cur[k- output_bytes])>>1)); } break;
4757 STBI__CASE(STBI__F_paeth) { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],prior[k],prior[k- output_bytes])); } break;
4758 STBI__CASE(STBI__F_avg_first) { cur[k] = STBI__BYTECAST(raw[k] + (cur[k- output_bytes] >> 1)); } break;
4759 STBI__CASE(STBI__F_paeth_first) { cur[k] = STBI__BYTECAST(raw[k] + stbi__paeth(cur[k- output_bytes],0,0)); } break;
4763 // the loop above sets the high byte of the pixels' alpha, but for
4764 // 16 bit png files we also need the low byte set. we'll do that here.
4766 cur = a->out + stride*j; // start at the beginning of the row again
4767 for (i=0; i < x; ++i,cur+=output_bytes) {
4768 cur[filter_bytes+1] = 255;
   // we make a separate pass to expand bits to pixels; for performance,
   // this could run two scanlines behind the above code, so it won't
   // interfere with filtering but will still be in the cache.
4778 for (j=0; j < y; ++j) {
4779 stbi_uc *cur = a->out + stride*j;
4780 stbi_uc *in = a->out + stride*j + x*out_n - img_width_bytes;
         // unpack 1/2/4-bit into an 8-bit buffer. allows us to keep the common 8-bit path optimal at minimal cost for 1/2/4-bit
         // png guarantees byte alignment; if width is not a multiple of 8/4/2 we'll decode dummy trailing data that will be skipped in the later loop
4783 stbi_uc scale = (color == 0) ? stbi__depth_scale_table[depth] : 1; // scale grayscale values to 0..255 range
4785 // note that the final byte might overshoot and write more data than desired.
4786 // we can allocate enough data that this never writes out of memory, but it
4787 // could also overwrite the next scanline. can it overwrite non-empty data
4788 // on the next scanline? yes, consider 1-pixel-wide scanlines with 1-bit-per-pixel.
4789 // so we need to explicitly clamp the final ones
4792 for (k=x*img_n; k >= 2; k-=2, ++in) {
4793 *cur++ = scale * ((*in >> 4) );
4794 *cur++ = scale * ((*in ) & 0x0f);
4796 if (k > 0) *cur++ = scale * ((*in >> 4) );
4797 } else if (depth == 2) {
4798 for (k=x*img_n; k >= 4; k-=4, ++in) {
4799 *cur++ = scale * ((*in >> 6) );
4800 *cur++ = scale * ((*in >> 4) & 0x03);
4801 *cur++ = scale * ((*in >> 2) & 0x03);
4802 *cur++ = scale * ((*in ) & 0x03);
4804 if (k > 0) *cur++ = scale * ((*in >> 6) );
4805 if (k > 1) *cur++ = scale * ((*in >> 4) & 0x03);
4806 if (k > 2) *cur++ = scale * ((*in >> 2) & 0x03);
4807 } else if (depth == 1) {
4808 for (k=x*img_n; k >= 8; k-=8, ++in) {
4809 *cur++ = scale * ((*in >> 7) );
4810 *cur++ = scale * ((*in >> 6) & 0x01);
4811 *cur++ = scale * ((*in >> 5) & 0x01);
4812 *cur++ = scale * ((*in >> 4) & 0x01);
4813 *cur++ = scale * ((*in >> 3) & 0x01);
4814 *cur++ = scale * ((*in >> 2) & 0x01);
4815 *cur++ = scale * ((*in >> 1) & 0x01);
4816 *cur++ = scale * ((*in ) & 0x01);
4818 if (k > 0) *cur++ = scale * ((*in >> 7) );
4819 if (k > 1) *cur++ = scale * ((*in >> 6) & 0x01);
4820 if (k > 2) *cur++ = scale * ((*in >> 5) & 0x01);
4821 if (k > 3) *cur++ = scale * ((*in >> 4) & 0x01);
4822 if (k > 4) *cur++ = scale * ((*in >> 3) & 0x01);
4823 if (k > 5) *cur++ = scale * ((*in >> 2) & 0x01);
4824 if (k > 6) *cur++ = scale * ((*in >> 1) & 0x01);
4826 if (img_n != out_n) {
4828 // insert alpha = 255
4829 cur = a->out + stride*j;
4831 for (q=x-1; q >= 0; --q) {
4833 cur[q*2+0] = cur[q];
4836 STBI_ASSERT(img_n == 3);
4837 for (q=x-1; q >= 0; --q) {
4839 cur[q*4+2] = cur[q*3+2];
4840 cur[q*4+1] = cur[q*3+1];
4841 cur[q*4+0] = cur[q*3+0];
4846 } else if (depth == 16) {
4847 // force the image data from big-endian to platform-native.
4848 // this is done in a separate pass due to the decoding relying
4849 // on the data being untouched, but could probably be done
4850 // per-line during decode if care is taken.
4851 stbi_uc *cur = a->out;
4852 stbi__uint16 *cur16 = (stbi__uint16*)cur;
4854 for(i=0; i < x*y*out_n; ++i,cur16++,cur+=2) {
4855 *cur16 = (cur[0] << 8) | cur[1];
4862 static int stbi__create_png_image(stbi__png *a, stbi_uc *image_data, stbi__uint32 image_data_len, int out_n, int depth, int color, int interlaced)
4864 int bytes = (depth == 16 ? 2 : 1);
4865 int out_bytes = out_n * bytes;
4869 return stbi__create_png_image_raw(a, image_data, image_data_len, out_n, a->s->img_x, a->s->img_y, depth, color);
4872 final = (stbi_uc *) stbi__malloc_mad3(a->s->img_x, a->s->img_y, out_bytes, 0);
4873 if (!final) return stbi__err("outofmem", "Out of memory");
4874 for (p=0; p < 7; ++p) {
4875 int xorig[] = { 0,4,0,2,0,1,0 };
4876 int yorig[] = { 0,0,4,0,2,0,1 };
4877 int xspc[] = { 8,8,4,4,2,2,1 };
4878 int yspc[] = { 8,8,8,4,4,2,2 };
4880 // pass1_x[4] = 0, pass1_x[5] = 1, pass1_x[12] = 1
4881 x = (a->s->img_x - xorig[p] + xspc[p]-1) / xspc[p];
4882 y = (a->s->img_y - yorig[p] + yspc[p]-1) / yspc[p];
4884 stbi__uint32 img_len = ((((a->s->img_n * x * depth) + 7) >> 3) + 1) * y;
4885 if (!stbi__create_png_image_raw(a, image_data, image_data_len, out_n, x, y, depth, color)) {
4889 for (j=0; j < y; ++j) {
4890 for (i=0; i < x; ++i) {
4891 int out_y = j*yspc[p]+yorig[p];
4892 int out_x = i*xspc[p]+xorig[p];
4893 memcpy(final + out_y*a->s->img_x*out_bytes + out_x*out_bytes,
4894 a->out + (j*x+i)*out_bytes, out_bytes);
4898 image_data += img_len;
4899 image_data_len -= img_len;
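
// Illustration only, not part of the library: the per-pass image size used
// by the interlace loop above, computed from the same origin and spacing
// tables. The adam7_* names are hypothetical, and the explicit "no pixels"
// test is an assumption of this sketch; the loop above gets the same result
// from unsigned arithmetic.

```c
#include <assert.h>

static const unsigned adam7_xorig[7] = { 0,4,0,2,0,1,0 };
static const unsigned adam7_yorig[7] = { 0,0,4,0,2,0,1 };
static const unsigned adam7_xspc[7]  = { 8,8,4,4,2,2,1 };
static const unsigned adam7_yspc[7]  = { 8,8,8,4,4,2,2 };

/* Number of columns of a w-pixel-wide image that belong to `pass`. */
static unsigned adam7_pass_width(unsigned w, int pass)
{
   assert(pass >= 0 && pass < 7);
   if (w <= adam7_xorig[pass]) return 0;
   return (w - adam7_xorig[pass] + adam7_xspc[pass] - 1) / adam7_xspc[pass];
}

/* Number of rows of an h-pixel-high image that belong to `pass`. */
static unsigned adam7_pass_height(unsigned h, int pass)
{
   assert(pass >= 0 && pass < 7);
   if (h <= adam7_yorig[pass]) return 0;
   return (h - adam7_yorig[pass] + adam7_yspc[pass] - 1) / adam7_yspc[pass];
}

/* Sum of all seven sub-image sizes; equals w * h for any image. */
static unsigned adam7_total_pixels(unsigned w, unsigned h)
{
   unsigned total = 0;
   int p;
   for (p = 0; p < 7; ++p)
      total += adam7_pass_width(w, p) * adam7_pass_height(h, p);
   return total;
}
```

// Every pixel belongs to exactly one pass, so the seven sub-images together
// cover the full image with no overlap.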
4907 static int stbi__compute_transparency(stbi__png *z, stbi_uc tc[3], int out_n)
4909 stbi__context *s = z->s;
4910 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4911 stbi_uc *p = z->out;
4913 // compute color-based transparency, assuming we've
4914 // already got 255 as the alpha value in the output
4915 STBI_ASSERT(out_n == 2 || out_n == 4);
4918 for (i=0; i < pixel_count; ++i) {
4919 p[1] = (p[0] == tc[0] ? 0 : 255);
4923 for (i=0; i < pixel_count; ++i) {
4924 if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4932 static int stbi__compute_transparency16(stbi__png *z, stbi__uint16 tc[3], int out_n)
4934 stbi__context *s = z->s;
4935 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
4936 stbi__uint16 *p = (stbi__uint16*) z->out;
4938 // compute color-based transparency, assuming we've
4939 // already got 65535 as the alpha value in the output
4940 STBI_ASSERT(out_n == 2 || out_n == 4);
4943 for (i = 0; i < pixel_count; ++i) {
4944 p[1] = (p[0] == tc[0] ? 0 : 65535);
4948 for (i = 0; i < pixel_count; ++i) {
4949 if (p[0] == tc[0] && p[1] == tc[1] && p[2] == tc[2])
4957 static int stbi__expand_png_palette(stbi__png *a, stbi_uc *palette, int len, int pal_img_n)
4959 stbi__uint32 i, pixel_count = a->s->img_x * a->s->img_y;
4960 stbi_uc *p, *temp_out, *orig = a->out;
4962 p = (stbi_uc *) stbi__malloc_mad2(pixel_count, pal_img_n, 0);
4963 if (p == NULL) return stbi__err("outofmem", "Out of memory");
   // between here and free(out) below, exiting would leak
4968 if (pal_img_n == 3) {
4969 for (i=0; i < pixel_count; ++i) {
4972 p[1] = palette[n+1];
4973 p[2] = palette[n+2];
4977 for (i=0; i < pixel_count; ++i) {
4980 p[1] = palette[n+1];
4981 p[2] = palette[n+2];
4982 p[3] = palette[n+3];
static int stbi__unpremultiply_on_load_global = 0;
static int stbi__de_iphone_flag_global = 0;

STBIDEF void stbi_set_unpremultiply_on_load(int flag_true_if_should_unpremultiply)
{
   stbi__unpremultiply_on_load_global = flag_true_if_should_unpremultiply;
}

STBIDEF void stbi_convert_iphone_png_to_rgb(int flag_true_if_should_convert)
{
   stbi__de_iphone_flag_global = flag_true_if_should_convert;
}

#ifndef STBI_THREAD_LOCAL
#define stbi__unpremultiply_on_load  stbi__unpremultiply_on_load_global
#define stbi__de_iphone_flag  stbi__de_iphone_flag_global
#else
static STBI_THREAD_LOCAL int stbi__unpremultiply_on_load_local, stbi__unpremultiply_on_load_set;
static STBI_THREAD_LOCAL int stbi__de_iphone_flag_local, stbi__de_iphone_flag_set;

STBIDEF void stbi_set_unpremultiply_on_load_thread(int flag_true_if_should_unpremultiply)
{
   stbi__unpremultiply_on_load_local = flag_true_if_should_unpremultiply;
   stbi__unpremultiply_on_load_set = 1;
}

STBIDEF void stbi_convert_iphone_png_to_rgb_thread(int flag_true_if_should_convert)
{
   stbi__de_iphone_flag_local = flag_true_if_should_convert;
   stbi__de_iphone_flag_set = 1;
}

// a per-thread setting, once made, takes precedence over the global one
#define stbi__unpremultiply_on_load  (stbi__unpremultiply_on_load_set           \
                                       ? stbi__unpremultiply_on_load_local      \
                                       : stbi__unpremultiply_on_load_global)
#define stbi__de_iphone_flag  (stbi__de_iphone_flag_set                         \
                                ? stbi__de_iphone_flag_local                    \
                                : stbi__de_iphone_flag_global)
#endif // STBI_THREAD_LOCAL
5034 static void stbi__de_iphone(stbi__png *z)
5036 stbi__context *s = z->s;
5037 stbi__uint32 i, pixel_count = s->img_x * s->img_y;
5038 stbi_uc *p = z->out;
5040 if (s->img_out_n == 3) { // convert bgr to rgb
5041 for (i=0; i < pixel_count; ++i) {
5048 STBI_ASSERT(s->img_out_n == 4);
5049 if (stbi__unpremultiply_on_load) {
5050 // convert bgr to rgb and unpremultiply
5051 for (i=0; i < pixel_count; ++i) {
5055 stbi_uc half = a / 2;
5056 p[0] = (p[2] * 255 + half) / a;
5057 p[1] = (p[1] * 255 + half) / a;
5058 p[2] = ( t * 255 + half) / a;
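
// Illustration only, not part of the library: the rounded division used
// above to undo alpha premultiplication, for one channel. unpremultiply is
// a hypothetical helper; it assumes c <= a, which premultiplied data
// implies, and leaves the channel untouched when alpha is 0.

```c
#include <assert.h>

/* Recover the straight-alpha channel value from a premultiplied one. */
static unsigned unpremultiply(unsigned c, unsigned a)
{
   assert(c <= 255 && a <= 255);
   assert(a == 0 || c <= a);
   if (a == 0) return c;               /* nothing to recover, avoid dividing by 0 */
   return (c * 255 + a / 2) / a;       /* adding a/2 rounds to nearest            */
}
```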
5066 // convert bgr to rgb
5067 for (i=0; i < pixel_count; ++i) {
5077 #define STBI__PNG_TYPE(a,b,c,d) (((unsigned) (a) << 24) + ((unsigned) (b) << 16) + ((unsigned) (c) << 8) + (unsigned) (d))
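
// Illustration only, not part of the library: the macro above packs a
// four-character chunk name into the same 32-bit value that a big-endian
// read of those four bytes produces, so chunk types can be used as switch
// cases. PNG_TYPE repeats the macro under a shorter name and be32 is a
// hypothetical stand-in for a big-endian read; both assume a 32-bit
// unsigned int.

```c
#include <assert.h>

#define PNG_TYPE(a,b,c,d)  (((unsigned) (a) << 24) + ((unsigned) (b) << 16) + ((unsigned) (c) << 8) + (unsigned) (d))

/* Combine four bytes, most significant first, as a file reader would. */
static unsigned be32(unsigned b0, unsigned b1, unsigned b2, unsigned b3)
{
   assert(b0 <= 255 && b1 <= 255 && b2 <= 255 && b3 <= 255);
   return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3;
}
```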
5079 static int stbi__parse_png_file(stbi__png *z, int scan, int req_comp)
5081 stbi_uc palette[1024], pal_img_n=0;
5082 stbi_uc has_trans=0, tc[3]={0};
5083 stbi__uint16 tc16[3];
5084 stbi__uint32 ioff=0, idata_limit=0, i, pal_len=0;
5085 int first=1,k,interlace=0, color=0, is_iphone=0;
5086 stbi__context *s = z->s;
5092 if (!stbi__check_png_header(s)) return 0;
5094 if (scan == STBI__SCAN_type) return 1;
5097 stbi__pngchunk c = stbi__get_chunk_header(s);
5099 case STBI__PNG_TYPE('C','g','B','I'):
5101 stbi__skip(s, c.length);
5103 case STBI__PNG_TYPE('I','H','D','R'): {
5105 if (!first) return stbi__err("multiple IHDR","Corrupt PNG");
5107 if (c.length != 13) return stbi__err("bad IHDR len","Corrupt PNG");
5108 s->img_x = stbi__get32be(s);
5109 s->img_y = stbi__get32be(s);
5110 if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
5111 if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
5112 z->depth = stbi__get8(s); if (z->depth != 1 && z->depth != 2 && z->depth != 4 && z->depth != 8 && z->depth != 16) return stbi__err("1/2/4/8/16-bit only","PNG not supported: 1/2/4/8/16-bit only");
5113 color = stbi__get8(s); if (color > 6) return stbi__err("bad ctype","Corrupt PNG");
5114 if (color == 3 && z->depth == 16) return stbi__err("bad ctype","Corrupt PNG");
5115 if (color == 3) pal_img_n = 3; else if (color & 1) return stbi__err("bad ctype","Corrupt PNG");
5116 comp = stbi__get8(s); if (comp) return stbi__err("bad comp method","Corrupt PNG");
5117 filter= stbi__get8(s); if (filter) return stbi__err("bad filter method","Corrupt PNG");
5118 interlace = stbi__get8(s); if (interlace>1) return stbi__err("bad interlace method","Corrupt PNG");
5119 if (!s->img_x || !s->img_y) return stbi__err("0-pixel image","Corrupt PNG");
5121 s->img_n = (color & 2 ? 3 : 1) + (color & 4 ? 1 : 0);
5122 if ((1 << 30) / s->img_x / s->img_n < s->img_y) return stbi__err("too large", "Image too large to decode");
5124 // if paletted, then pal_n is our final components, and
5125 // img_n is # components to decompress/filter.
5127 if ((1 << 30) / s->img_x / 4 < s->img_y) return stbi__err("too large","Corrupt PNG");
5129 // even with SCAN_header, have to scan to see if we have a tRNS
5133 case STBI__PNG_TYPE('P','L','T','E'): {
5134 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
5135 if (c.length > 256*3) return stbi__err("invalid PLTE","Corrupt PNG");
5136 pal_len = c.length / 3;
5137 if (pal_len * 3 != c.length) return stbi__err("invalid PLTE","Corrupt PNG");
5138 for (i=0; i < pal_len; ++i) {
5139 palette[i*4+0] = stbi__get8(s);
5140 palette[i*4+1] = stbi__get8(s);
5141 palette[i*4+2] = stbi__get8(s);
5142 palette[i*4+3] = 255;
5147 case STBI__PNG_TYPE('t','R','N','S'): {
5148 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
5149 if (z->idata) return stbi__err("tRNS after IDAT","Corrupt PNG");
5151 if (scan == STBI__SCAN_header) { s->img_n = 4; return 1; }
5152 if (pal_len == 0) return stbi__err("tRNS before PLTE","Corrupt PNG");
5153 if (c.length > pal_len) return stbi__err("bad tRNS len","Corrupt PNG");
5155 for (i=0; i < c.length; ++i)
5156 palette[i*4+3] = stbi__get8(s);
5158 if (!(s->img_n & 1)) return stbi__err("tRNS with alpha","Corrupt PNG");
5159 if (c.length != (stbi__uint32) s->img_n*2) return stbi__err("bad tRNS len","Corrupt PNG");
5161 // non-paletted with tRNS = constant alpha. if header-scanning, we can stop now.
5162 if (scan == STBI__SCAN_header) { ++s->img_n; return 1; }
5163 if (z->depth == 16) {
5164 for (k = 0; k < s->img_n; ++k) tc16[k] = (stbi__uint16)stbi__get16be(s); // copy the values as-is
5166 for (k = 0; k < s->img_n; ++k) tc[k] = (stbi_uc)(stbi__get16be(s) & 255) * stbi__depth_scale_table[z->depth]; // non 8-bit images will be larger
5172 case STBI__PNG_TYPE('I','D','A','T'): {
5173 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
5174 if (pal_img_n && !pal_len) return stbi__err("no PLTE","Corrupt PNG");
5175 if (scan == STBI__SCAN_header) {
5176 // header scan definitely stops at first IDAT
5178 s->img_n = pal_img_n;
5181 if (c.length > (1u << 30)) return stbi__err("IDAT size limit", "IDAT section larger than 2^30 bytes");
5182 if ((int)(ioff + c.length) < (int)ioff) return 0;
5183 if (ioff + c.length > idata_limit) {
5184 stbi__uint32 idata_limit_old = idata_limit;
5186 if (idata_limit == 0) idata_limit = c.length > 4096 ? c.length : 4096;
5187 while (ioff + c.length > idata_limit)
5189 STBI_NOTUSED(idata_limit_old);
5190 p = (stbi_uc *) STBI_REALLOC_SIZED(z->idata, idata_limit_old, idata_limit); if (p == NULL) return stbi__err("outofmem", "Out of memory");
5193 if (!stbi__getn(s, z->idata+ioff,c.length)) return stbi__err("outofdata","Corrupt PNG");
5198 case STBI__PNG_TYPE('I','E','N','D'): {
5199 stbi__uint32 raw_len, bpl;
5200 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
5201 if (scan != STBI__SCAN_load) return 1;
5202 if (z->idata == NULL) return stbi__err("no IDAT","Corrupt PNG");
5203 // initial guess for decoded data size to avoid unnecessary reallocs
5204 bpl = (s->img_x * z->depth + 7) / 8; // bytes per line, per component
5205 raw_len = bpl * s->img_y * s->img_n /* pixels */ + s->img_y /* filter mode per row */;
5206 z->expanded = (stbi_uc *) stbi_zlib_decode_malloc_guesssize_headerflag((char *) z->idata, ioff, raw_len, (int *) &raw_len, !is_iphone);
5207 if (z->expanded == NULL) return 0; // zlib should set error
5208 STBI_FREE(z->idata); z->idata = NULL;
5209 if ((req_comp == s->img_n+1 && req_comp != 3 && !pal_img_n) || has_trans)
5210 s->img_out_n = s->img_n+1;
5212 s->img_out_n = s->img_n;
5213 if (!stbi__create_png_image(z, z->expanded, raw_len, s->img_out_n, z->depth, color, interlace)) return 0;
5215 if (z->depth == 16) {
5216 if (!stbi__compute_transparency16(z, tc16, s->img_out_n)) return 0;
5218 if (!stbi__compute_transparency(z, tc, s->img_out_n)) return 0;
5221 if (is_iphone && stbi__de_iphone_flag && s->img_out_n > 2)
5224 // pal_img_n == 3 or 4
5225 s->img_n = pal_img_n; // record the actual colors we had
5226 s->img_out_n = pal_img_n;
5227 if (req_comp >= 3) s->img_out_n = req_comp;
5228 if (!stbi__expand_png_palette(z, palette, pal_len, s->img_out_n))
5230 } else if (has_trans) {
5231 // non-paletted image with tRNS -> source image has (constant) alpha
5234 STBI_FREE(z->expanded); z->expanded = NULL;
5235 // end of PNG chunk, read and skip CRC
5241 // if critical, fail
5242 if (first) return stbi__err("first not IHDR", "Corrupt PNG");
5243 if ((c.type & (1 << 29)) == 0) {
5244 #ifndef STBI_NO_FAILURE_STRINGS
5246 static char invalid_chunk[] = "XXXX PNG chunk not known";
5247 invalid_chunk[0] = STBI__BYTECAST(c.type >> 24);
5248 invalid_chunk[1] = STBI__BYTECAST(c.type >> 16);
5249 invalid_chunk[2] = STBI__BYTECAST(c.type >> 8);
5250 invalid_chunk[3] = STBI__BYTECAST(c.type >> 0);
5252 return stbi__err(invalid_chunk, "PNG not supported: unknown PNG chunk type");
5254 stbi__skip(s, c.length);
5257 // end of PNG chunk, read and skip CRC
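// Illustrative sketch (not part of stb_image): the unknown-chunk test above
// relies on the PNG convention that a chunk is critical when bit 5 of the
// first tag byte is clear, which is bit 29 once the four tag bytes are packed
// big-endian. The helper names chunk_type_from_tag and chunk_is_critical are
// hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Pack a four-character chunk tag into a big-endian 32-bit value,
// e.g. "IHDR" -> 0x49484452.
static uint32_t chunk_type_from_tag(const char *tag)
{
    assert(tag != NULL && strlen(tag) == 4);
    return ((uint32_t)(unsigned char)tag[0] << 24) |
           ((uint32_t)(unsigned char)tag[1] << 16) |
           ((uint32_t)(unsigned char)tag[2] <<  8) |
            (uint32_t)(unsigned char)tag[3];
}

// A chunk is critical when bit 5 (0x20) of its first tag byte is clear,
// i.e. the first letter is uppercase. In the packed value that is bit 29.
static int chunk_is_critical(uint32_t type)
{
    return (type & (UINT32_C(1) << 29)) == 0;
}
```

// With these helpers, "IDAT" and "PLTE" test as critical while "tRNS" and
// "tEXt" test as ancillary, so a decoder may skip the latter when unknown.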
5262 static void *stbi__do_png(stbi__png *p, int *x, int *y, int *n, int req_comp, stbi__result_info *ri)
5265 if (req_comp < 0 || req_comp > 4) return stbi__errpuc("bad req_comp", "Internal error");
5266 if (stbi__parse_png_file(p, STBI__SCAN_load, req_comp)) {
5268 ri->bits_per_channel = 8;
5269 else if (p->depth == 16)
5270 ri->bits_per_channel = 16;
5272 return stbi__errpuc("bad bits_per_channel", "PNG not supported: unsupported color depth");
5275 if (req_comp && req_comp != p->s->img_out_n) {
5276 if (ri->bits_per_channel == 8)
5277 result = stbi__convert_format((unsigned char *) result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
5279 result = stbi__convert_format16((stbi__uint16 *) result, p->s->img_out_n, req_comp, p->s->img_x, p->s->img_y);
5280 p->s->img_out_n = req_comp;
5281 if (result == NULL) return result;
5285 if (n) *n = p->s->img_n;
5287 STBI_FREE(p->out); p->out = NULL;
5288 STBI_FREE(p->expanded); p->expanded = NULL;
5289 STBI_FREE(p->idata); p->idata = NULL;
5294 static void *stbi__png_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
5298 return stbi__do_png(&p, x,y,comp,req_comp, ri);
5301 static int stbi__png_test(stbi__context *s)
5304 r = stbi__check_png_header(s);
5309 static int stbi__png_info_raw(stbi__png *p, int *x, int *y, int *comp)
5311 if (!stbi__parse_png_file(p, STBI__SCAN_header, 0)) {
5312 stbi__rewind( p->s );
5315 if (x) *x = p->s->img_x;
5316 if (y) *y = p->s->img_y;
5317 if (comp) *comp = p->s->img_n;
5321 static int stbi__png_info(stbi__context *s, int *x, int *y, int *comp)
5325 return stbi__png_info_raw(&p, x, y, comp);
5328 static int stbi__png_is16(stbi__context *s)
5332 if (!stbi__png_info_raw(&p, NULL, NULL, NULL))
5334 if (p.depth != 16) {
5342 // Microsoft/Windows BMP image
5345 static int stbi__bmp_test_raw(stbi__context *s)
5349 if (stbi__get8(s) != 'B') return 0;
5350 if (stbi__get8(s) != 'M') return 0;
5351 stbi__get32le(s); // discard filesize
5352 stbi__get16le(s); // discard reserved
5353 stbi__get16le(s); // discard reserved
5354 stbi__get32le(s); // discard data offset
5355 sz = stbi__get32le(s);
5356 r = (sz == 12 || sz == 40 || sz == 56 || sz == 108 || sz == 124);
5360 static int stbi__bmp_test(stbi__context *s)
5362 int r = stbi__bmp_test_raw(s);
5368 // returns 0..31 for the highest set bit, or -1 if z == 0
5369 static int stbi__high_bit(unsigned int z)
5372 if (z == 0) return -1;
5373 if (z >= 0x10000) { n += 16; z >>= 16; }
5374 if (z >= 0x00100) { n += 8; z >>= 8; }
5375 if (z >= 0x00010) { n += 4; z >>= 4; }
5376 if (z >= 0x00004) { n += 2; z >>= 2; }
5377 if (z >= 0x00002) { n += 1;/* >>= 1;*/ }
5381 static int stbi__bitcount(unsigned int a)
5383 a = (a & 0x55555555) + ((a >> 1) & 0x55555555); // max 2
5384 a = (a & 0x33333333) + ((a >> 2) & 0x33333333); // max 4
5385 a = (a + (a >> 4)) & 0x0f0f0f0f; // max 8 per 4, now 8 bits
5386 a = (a + (a >> 8)); // max 16 per 8 bits
5387 a = (a + (a >> 16)); // max 32 per 8 bits
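// Illustrative sketch (not part of stb_image): plain-loop reference versions
// of the two bit helpers above, plus a small helper showing how a BMP channel
// mask could be turned into a shift and a bit count. The names ref_high_bit,
// ref_bitcount and describe_mask are hypothetical; the loops are slower than
// the branch/mask tricks but easy to check by eye.

```c
#include <assert.h>
#include <stddef.h>

// Index (0..31) of the highest set bit, or -1 when z == 0.
static int ref_high_bit(unsigned int z)
{
    int n = -1;
    while (z) { ++n; z >>= 1; }
    return n;
}

// Number of set bits in a.
static int ref_bitcount(unsigned int a)
{
    int count = 0;
    while (a) { count += (int)(a & 1u); a >>= 1; }
    return count;
}

// For a non-zero channel mask, compute the right shift that moves the mask's
// top bit to bit 7 (a negative value means "shift left") and the channel width.
static void describe_mask(unsigned int mask, int *shift, int *count)
{
    assert(mask != 0 && shift != NULL && count != NULL);
    *shift = ref_high_bit(mask) - 7;
    *count = ref_bitcount(mask);
}
```

// For the default 16-bit red mask (31u << 10) this yields shift 7 and count 5;
// for a mask of 0x1f it yields shift -3 and count 5.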
5391 // extract an arbitrarily-aligned N-bit value (N=bits)
5392 // from v, and then make it 8-bits long and fractionally
5393 // extend it to full range.
5394 static int stbi__shiftsigned(unsigned int v, int shift, int bits)
5396 static unsigned int mul_table[9] = {
5398 0xff/*0b11111111*/, 0x55/*0b01010101*/, 0x49/*0b01001001*/, 0x11/*0b00010001*/,
5399 0x21/*0b00100001*/, 0x41/*0b01000001*/, 0x81/*0b10000001*/, 0x01/*0b00000001*/,
5401 static unsigned int shift_table[9] = {
5408 STBI_ASSERT(v < 256);
5410 STBI_ASSERT(bits >= 0 && bits <= 8);
5411 return (int) ((unsigned) v * mul_table[bits]) >> shift_table[bits];
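// Illustrative sketch (not part of stb_image): "fractionally extending" an
// N-bit value to 8 bits can be done by repeating its bit pattern until all
// 8 bits are filled, so that 0 maps to 0 and the maximum maps to 255. The
// function name expand_to_8bit is hypothetical, and this loop is an assumed
// equivalent of the multiply/shift tables above rather than the library code.

```c
#include <assert.h>

// Expand v (which must fit in `bits` bits, 1 <= bits <= 8) to 8 bits by
// bit replication, e.g. 5-bit abcde -> abcdeabc.
static unsigned int expand_to_8bit(unsigned int v, int bits)
{
    unsigned int out = 0;
    int pos = 8;                       // next free bit position from the top
    assert(bits >= 1 && bits <= 8);
    assert(v < (1u << bits));
    while (pos > 0) {
        pos -= bits;
        if (pos >= 0) out |= v << pos;     // whole copy fits
        else          out |= v >> -pos;    // last copy is truncated
    }
    return out;
}
```

// For example, the 5-bit value 16 expands to 132 and the 3-bit value 5
// expands to 182, while the all-ones value of any width expands to 255.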
5416 int bpp, offset, hsz;
5417 unsigned int mr,mg,mb,ma, all_a;
5421 static int stbi__bmp_set_mask_defaults(stbi__bmp_data *info, int compress)
5423 // BI_BITFIELDS specifies masks explicitly, don't override
5427 if (compress == 0) {
5428 if (info->bpp == 16) {
5429 info->mr = 31u << 10;
5430 info->mg = 31u << 5;
5431 info->mb = 31u << 0;
5432 } else if (info->bpp == 32) {
5433 info->mr = 0xffu << 16;
5434 info->mg = 0xffu << 8;
5435 info->mb = 0xffu << 0;
5436 info->ma = 0xffu << 24;
5437 info->all_a = 0; // if all_a is 0 at end, then we loaded alpha channel but it was all 0
5439 // otherwise, use defaults, which is all-0
5440 info->mr = info->mg = info->mb = info->ma = 0;
5447 static void *stbi__bmp_parse_header(stbi__context *s, stbi__bmp_data *info)
5450 if (stbi__get8(s) != 'B' || stbi__get8(s) != 'M') return stbi__errpuc("not BMP", "Corrupt BMP");
5451 stbi__get32le(s); // discard filesize
5452 stbi__get16le(s); // discard reserved
5453 stbi__get16le(s); // discard reserved
5454 info->offset = stbi__get32le(s);
5455 info->hsz = hsz = stbi__get32le(s);
5456 info->mr = info->mg = info->mb = info->ma = 0;
5457 info->extra_read = 14;
5459 if (info->offset < 0) return stbi__errpuc("bad BMP", "bad BMP");
5461 if (hsz != 12 && hsz != 40 && hsz != 56 && hsz != 108 && hsz != 124) return stbi__errpuc("unknown BMP", "BMP type not supported: unknown");
5463 s->img_x = stbi__get16le(s);
5464 s->img_y = stbi__get16le(s);
5466 s->img_x = stbi__get32le(s);
5467 s->img_y = stbi__get32le(s);
5469 if (stbi__get16le(s) != 1) return stbi__errpuc("bad BMP", "bad BMP");
5470 info->bpp = stbi__get16le(s);
5472 int compress = stbi__get32le(s);
5473 if (compress == 1 || compress == 2) return stbi__errpuc("BMP RLE", "BMP type not supported: RLE");
5474 if (compress >= 4) return stbi__errpuc("BMP JPEG/PNG", "BMP type not supported: unsupported compression"); // this includes PNG/JPEG modes
5475 if (compress == 3 && info->bpp != 16 && info->bpp != 32) return stbi__errpuc("bad BMP", "bad BMP"); // bitfields requires 16 or 32 bits/pixel
5476 stbi__get32le(s); // discard sizeof
5477 stbi__get32le(s); // discard hres
5478 stbi__get32le(s); // discard vres
5479 stbi__get32le(s); // discard colorsused
5480 stbi__get32le(s); // discard max important
5481 if (hsz == 40 || hsz == 56) {
5488 if (info->bpp == 16 || info->bpp == 32) {
5489 if (compress == 0) {
5490 stbi__bmp_set_mask_defaults(info, compress);
5491 } else if (compress == 3) {
5492 info->mr = stbi__get32le(s);
5493 info->mg = stbi__get32le(s);
5494 info->mb = stbi__get32le(s);
5495 info->extra_read += 12;
5496 // not documented, but generated by photoshop and handled by mspaint
5497 if (info->mr == info->mg && info->mg == info->mb) {
5499 return stbi__errpuc("bad BMP", "bad BMP");
5502 return stbi__errpuc("bad BMP", "bad BMP");
5507 if (hsz != 108 && hsz != 124)
5508 return stbi__errpuc("bad BMP", "bad BMP");
5509 info->mr = stbi__get32le(s);
5510 info->mg = stbi__get32le(s);
5511 info->mb = stbi__get32le(s);
5512 info->ma = stbi__get32le(s);
5513 if (compress != 3) // override mr/mg/mb unless in BI_BITFIELDS mode, as per docs
5514 stbi__bmp_set_mask_defaults(info, compress);
5515 stbi__get32le(s); // discard color space
5516 for (i=0; i < 12; ++i)
5517 stbi__get32le(s); // discard color space parameters
5519 stbi__get32le(s); // discard rendering intent
5520 stbi__get32le(s); // discard offset of profile data
5521 stbi__get32le(s); // discard size of profile data
5522 stbi__get32le(s); // discard reserved
5530 static void *stbi__bmp_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
5533 unsigned int mr=0,mg=0,mb=0,ma=0, all_a;
5534 stbi_uc pal[256][4];
5535 int psize=0,i,j,width;
5536 int flip_vertically, pad, target;
5537 stbi__bmp_data info;
5541 if (stbi__bmp_parse_header(s, &info) == NULL)
5542 return NULL; // error code already set
5544 flip_vertically = ((int) s->img_y) > 0;
5545 s->img_y = abs((int) s->img_y);
5547 if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
5548 if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
5556 if (info.hsz == 12) {
5558 psize = (info.offset - info.extra_read - 24) / 3;
5561 psize = (info.offset - info.extra_read - info.hsz) >> 2;
5564 // accept some number of extra bytes after the header, but if the offset points either to before
5565 // the header ends or implies a large amount of extra data, reject the file as malformed
5566 int bytes_read_so_far = s->callback_already_read + (int)(s->img_buffer - s->img_buffer_original);
5567 int header_limit = 1024; // max we actually read is below 256 bytes currently.
5568 int extra_data_limit = 256*4; // what ordinarily goes here is a palette; 256 entries*4 bytes is its max size.
5569 if (bytes_read_so_far <= 0 || bytes_read_so_far > header_limit) {
5570 return stbi__errpuc("bad header", "Corrupt BMP");
5572 // we established that bytes_read_so_far is positive and sensible.
5573 // the first half of this test rejects offsets that are negative, or positive but too
5574 // small, and guarantees that info.offset >= bytes_read_so_far > 0. this in turn
5575 // ensures the number computed in the second half of the test can't overflow.
5576 if (info.offset < bytes_read_so_far || info.offset - bytes_read_so_far > extra_data_limit) {
5577 return stbi__errpuc("bad offset", "Corrupt BMP");
5579 stbi__skip(s, info.offset - bytes_read_so_far);
5583 if (info.bpp == 24 && ma == 0xff000000)
5586 s->img_n = ma ? 4 : 3;
5587 if (req_comp && req_comp >= 3) // we can directly decode 3 or 4
5590 target = s->img_n; // if they want monochrome, we'll post-convert
5592 // sanity-check size
5593 if (!stbi__mad3sizes_valid(target, s->img_x, s->img_y, 0))
5594 return stbi__errpuc("too large", "Corrupt BMP");
5596 out = (stbi_uc *) stbi__malloc_mad3(target, s->img_x, s->img_y, 0);
5597 if (!out) return stbi__errpuc("outofmem", "Out of memory");
5598 if (info.bpp < 16) {
5600 if (psize == 0 || psize > 256) { STBI_FREE(out); return stbi__errpuc("invalid", "Corrupt BMP"); }
5601 for (i=0; i < psize; ++i) {
5602 pal[i][2] = stbi__get8(s);
5603 pal[i][1] = stbi__get8(s);
5604 pal[i][0] = stbi__get8(s);
5605 if (info.hsz != 12) stbi__get8(s);
5608 stbi__skip(s, info.offset - info.extra_read - info.hsz - psize * (info.hsz == 12 ? 3 : 4));
5609 if (info.bpp == 1) width = (s->img_x + 7) >> 3;
5610 else if (info.bpp == 4) width = (s->img_x + 1) >> 1;
5611 else if (info.bpp == 8) width = s->img_x;
5612 else { STBI_FREE(out); return stbi__errpuc("bad bpp", "Corrupt BMP"); }
5614 if (info.bpp == 1) {
5615 for (j=0; j < (int) s->img_y; ++j) {
5616 int bit_offset = 7, v = stbi__get8(s);
5617 for (i=0; i < (int) s->img_x; ++i) {
5618 int color = (v>>bit_offset)&0x1;
5619 out[z++] = pal[color][0];
5620 out[z++] = pal[color][1];
5621 out[z++] = pal[color][2];
5622 if (target == 4) out[z++] = 255;
5623 if (i+1 == (int) s->img_x) break;
5624 if((--bit_offset) < 0) {
5632 for (j=0; j < (int) s->img_y; ++j) {
5633 for (i=0; i < (int) s->img_x; i += 2) {
5634 int v=stbi__get8(s),v2=0;
5635 if (info.bpp == 4) {
5639 out[z++] = pal[v][0];
5640 out[z++] = pal[v][1];
5641 out[z++] = pal[v][2];
5642 if (target == 4) out[z++] = 255;
5643 if (i+1 == (int) s->img_x) break;
5644 v = (info.bpp == 8) ? stbi__get8(s) : v2;
5645 out[z++] = pal[v][0];
5646 out[z++] = pal[v][1];
5647 out[z++] = pal[v][2];
5648 if (target == 4) out[z++] = 255;
5654 int rshift=0,gshift=0,bshift=0,ashift=0,rcount=0,gcount=0,bcount=0,acount=0;
5657 stbi__skip(s, info.offset - info.extra_read - info.hsz);
5658 if (info.bpp == 24) width = 3 * s->img_x;
5659 else if (info.bpp == 16) width = 2*s->img_x;
5660 else /* bpp = 32 and pad = 0 */ width=0;
5662 if (info.bpp == 24) {
5664 } else if (info.bpp == 32) {
5665 if (mb == 0xff && mg == 0xff00 && mr == 0x00ff0000 && ma == 0xff000000)
5669 if (!mr || !mg || !mb) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
5670 // right shift amt to put high bit in position #7
5671 rshift = stbi__high_bit(mr)-7; rcount = stbi__bitcount(mr);
5672 gshift = stbi__high_bit(mg)-7; gcount = stbi__bitcount(mg);
5673 bshift = stbi__high_bit(mb)-7; bcount = stbi__bitcount(mb);
5674 ashift = stbi__high_bit(ma)-7; acount = stbi__bitcount(ma);
5675 if (rcount > 8 || gcount > 8 || bcount > 8 || acount > 8) { STBI_FREE(out); return stbi__errpuc("bad masks", "Corrupt BMP"); }
5677 for (j=0; j < (int) s->img_y; ++j) {
5679 for (i=0; i < (int) s->img_x; ++i) {
5681 out[z+2] = stbi__get8(s);
5682 out[z+1] = stbi__get8(s);
5683 out[z+0] = stbi__get8(s);
5685 a = (easy == 2 ? stbi__get8(s) : 255);
5687 if (target == 4) out[z++] = a;
5691 for (i=0; i < (int) s->img_x; ++i) {
5692 stbi__uint32 v = (bpp == 16 ? (stbi__uint32) stbi__get16le(s) : stbi__get32le(s));
5694 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mr, rshift, rcount));
5695 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mg, gshift, gcount));
5696 out[z++] = STBI__BYTECAST(stbi__shiftsigned(v & mb, bshift, bcount));
5697 a = (ma ? stbi__shiftsigned(v & ma, ashift, acount) : 255);
5699 if (target == 4) out[z++] = STBI__BYTECAST(a);
5706 // if alpha channel is all 0s, replace with all 255s
5707 if (target == 4 && all_a == 0)
5708 for (i=4*s->img_x*s->img_y-1; i >= 0; i -= 4)
5711 if (flip_vertically) {
5713 for (j=0; j < (int) s->img_y>>1; ++j) {
5714 stbi_uc *p1 = out + j *s->img_x*target;
5715 stbi_uc *p2 = out + (s->img_y-1-j)*s->img_x*target;
5716 for (i=0; i < (int) s->img_x*target; ++i) {
5717 t = p1[i]; p1[i] = p2[i]; p2[i] = t;
5722 if (req_comp && req_comp != target) {
5723 out = stbi__convert_format(out, target, req_comp, s->img_x, s->img_y);
5724 if (out == NULL) return out; // stbi__convert_format frees input on failure
5729 if (comp) *comp = s->img_n;
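// Illustrative sketch (not part of stb_image): BMP rows are normally stored
// bottom-up, so the loader above swaps row j with row (height-1-j). The
// standalone helper below shows the same in-place swap on a tightly packed
// buffer; the name flip_rows and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

// Reverse the row order of a tightly packed image in place.
static void flip_rows(unsigned char *pixels, int width, int height, int channels)
{
    size_t stride;
    int j;
    assert(pixels != NULL && width > 0 && height > 0 && channels > 0);
    stride = (size_t)width * (size_t)channels;   // bytes per row
    for (j = 0; j < height / 2; ++j) {
        unsigned char *top = pixels + (size_t)j * stride;
        unsigned char *bot = pixels + (size_t)(height - 1 - j) * stride;
        size_t i;
        for (i = 0; i < stride; ++i) {
            unsigned char t = top[i];
            top[i] = bot[i];
            bot[i] = t;
        }
    }
}
```

// With an odd number of rows the middle row is left untouched, since only
// height/2 swaps are performed.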
5734 // Targa Truevision - TGA
5735 // by Jonathan Dummer
5737 // returns the component count (STBI_grey, STBI_grey_alpha, STBI_rgb or STBI_rgb_alpha), 0 on error
5738 static int stbi__tga_get_comp(int bits_per_pixel, int is_grey, int* is_rgb16)
5740 // only RGB or RGBA (incl. 16bit) or grey allowed
5741 if (is_rgb16) *is_rgb16 = 0;
5742 switch(bits_per_pixel) {
5743 case 8: return STBI_grey;
5744 case 16: if(is_grey) return STBI_grey_alpha;
5746 case 15: if(is_rgb16) *is_rgb16 = 1;
5748 case 24: // fallthrough
5749 case 32: return bits_per_pixel/8;
5754 static int stbi__tga_info(stbi__context *s, int *x, int *y, int *comp)
5756 int tga_w, tga_h, tga_comp, tga_image_type, tga_bits_per_pixel, tga_colormap_bpp;
5757 int sz, tga_colormap_type;
5758 stbi__get8(s); // discard Offset
5759 tga_colormap_type = stbi__get8(s); // colormap type
5760 if( tga_colormap_type > 1 ) {
5762 return 0; // only RGB or indexed allowed
5764 tga_image_type = stbi__get8(s); // image type
5765 if ( tga_colormap_type == 1 ) { // colormapped (paletted) image
5766 if (tga_image_type != 1 && tga_image_type != 9) {
5770 stbi__skip(s,4); // skip index of first colormap entry and number of entries
5771 sz = stbi__get8(s); // check bits per palette color entry
5772 if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) {
5776 stbi__skip(s,4); // skip image x and y origin
5777 tga_colormap_bpp = sz;
5778 } else { // "normal" image w/o colormap - only RGB or grey allowed, +/- RLE
5779 if ( (tga_image_type != 2) && (tga_image_type != 3) && (tga_image_type != 10) && (tga_image_type != 11) ) {
5781 return 0; // only RGB or grey allowed, +/- RLE
5783 stbi__skip(s,9); // skip colormap specification and image x/y origin
5784 tga_colormap_bpp = 0;
5786 tga_w = stbi__get16le(s);
5789 return 0; // test width
5791 tga_h = stbi__get16le(s);
5794 return 0; // test height
5796 tga_bits_per_pixel = stbi__get8(s); // bits per pixel
5797 stbi__get8(s); // ignore alpha bits
5798 if (tga_colormap_bpp != 0) {
5799 if((tga_bits_per_pixel != 8) && (tga_bits_per_pixel != 16)) {
5800 // when using a colormap, tga_bits_per_pixel is the size of the indexes
5801 // I don't think anything but 8 or 16bit indexes makes sense
5805 tga_comp = stbi__tga_get_comp(tga_colormap_bpp, 0, NULL);
5807 tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3) || (tga_image_type == 11), NULL);
5815 if (comp) *comp = tga_comp;
5816 return 1; // seems to have passed everything
5819 static int stbi__tga_test(stbi__context *s)
5822 int sz, tga_color_type;
5823 stbi__get8(s); // discard Offset
5824 tga_color_type = stbi__get8(s); // color type
5825 if ( tga_color_type > 1 ) goto errorEnd; // only RGB or indexed allowed
5826 sz = stbi__get8(s); // image type
5827 if ( tga_color_type == 1 ) { // colormapped (paletted) image
5828 if (sz != 1 && sz != 9) goto errorEnd; // colortype 1 demands image type 1 or 9
5829 stbi__skip(s,4); // skip index of first colormap entry and number of entries
5830 sz = stbi__get8(s); // check bits per palette color entry
5831 if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
5832 stbi__skip(s,4); // skip image x and y origin
5833 } else { // "normal" image w/o colormap
5834 if ( (sz != 2) && (sz != 3) && (sz != 10) && (sz != 11) ) goto errorEnd; // only RGB or grey allowed, +/- RLE
5835 stbi__skip(s,9); // skip colormap specification and image x/y origin
5837 if ( stbi__get16le(s) < 1 ) goto errorEnd; // test width
5838 if ( stbi__get16le(s) < 1 ) goto errorEnd; // test height
5839 sz = stbi__get8(s); // bits per pixel
5840 if ( (tga_color_type == 1) && (sz != 8) && (sz != 16) ) goto errorEnd; // for colormapped images, bpp is size of an index
5841 if ( (sz != 8) && (sz != 15) && (sz != 16) && (sz != 24) && (sz != 32) ) goto errorEnd;
5843 res = 1; // if we got this far, everything's good and we can return 1 instead of 0
5850 // read 16bit value and convert to 24bit RGB
5851 static void stbi__tga_read_rgb16(stbi__context *s, stbi_uc* out)
5853 stbi__uint16 px = (stbi__uint16)stbi__get16le(s);
5854 stbi__uint16 fiveBitMask = 31;
5855 // we have 3 channels with 5bits each
5856 int r = (px >> 10) & fiveBitMask;
5857 int g = (px >> 5) & fiveBitMask;
5858 int b = px & fiveBitMask;
5859 // Note that this saves the data in RGB(A) order, so it doesn't need to be swapped later
5860 out[0] = (stbi_uc)((r * 255)/31);
5861 out[1] = (stbi_uc)((g * 255)/31);
5862 out[2] = (stbi_uc)((b * 255)/31);
5864 // some people claim that the most significant bit might be used for alpha
5865 // (possibly if an alpha-bit is set in the "image descriptor byte")
5866 // but that only made 16bit test images completely translucent..
5867 // so let's treat all 15 and 16bit TGAs as RGB with no alpha.
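// Illustrative sketch (not part of stb_image): the 16-bit TGA path above
// unpacks three 5-bit channels and rescales each from 0..31 to 0..255. The
// standalone version below takes the packed pixel as an argument instead of
// reading it from a stream; the name rgb555_to_rgb888 is hypothetical, and
// the top bit is ignored, matching the "no alpha" decision described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Convert one packed 5-5-5 pixel (bit 15 ignored) to 8-bit R, G, B.
static void rgb555_to_rgb888(uint16_t px, uint8_t out[3])
{
    int r = (px >> 10) & 31;
    int g = (px >>  5) & 31;
    int b =  px        & 31;
    assert(out != NULL);
    out[0] = (uint8_t)((r * 255) / 31);   // 31 -> 255, 0 -> 0
    out[1] = (uint8_t)((g * 255) / 31);
    out[2] = (uint8_t)((b * 255) / 31);
}
```

// The integer division truncates, so a mid-range channel value of 16 maps
// to 131 rather than being rounded to 132.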
5870 static void *stbi__tga_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
5872 // read in the TGA header stuff
5873 int tga_offset = stbi__get8(s);
5874 int tga_indexed = stbi__get8(s);
5875 int tga_image_type = stbi__get8(s);
5877 int tga_palette_start = stbi__get16le(s);
5878 int tga_palette_len = stbi__get16le(s);
5879 int tga_palette_bits = stbi__get8(s);
5880 int tga_x_origin = stbi__get16le(s);
5881 int tga_y_origin = stbi__get16le(s);
5882 int tga_width = stbi__get16le(s);
5883 int tga_height = stbi__get16le(s);
5884 int tga_bits_per_pixel = stbi__get8(s);
5885 int tga_comp, tga_rgb16=0;
5886 int tga_inverted = stbi__get8(s);
5887 // int tga_alpha_bits = tga_inverted & 15; // the 4 lowest bits - unused (useless?)
5889 unsigned char *tga_data;
5890 unsigned char *tga_palette = NULL;
5892 unsigned char raw_data[4] = {0};
5894 int RLE_repeating = 0;
5895 int read_next_pixel = 1;
5897 STBI_NOTUSED(tga_x_origin); // @TODO
5898 STBI_NOTUSED(tga_y_origin); // @TODO
5900 if (tga_height > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
5901 if (tga_width > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
5903 // do a tiny bit of processing
5904 if ( tga_image_type >= 8 )
5906 tga_image_type -= 8;
5909 tga_inverted = 1 - ((tga_inverted >> 5) & 1);
5911 // If I'm paletted, then I'll use the number of bits from the palette
5912 if ( tga_indexed ) tga_comp = stbi__tga_get_comp(tga_palette_bits, 0, &tga_rgb16);
5913 else tga_comp = stbi__tga_get_comp(tga_bits_per_pixel, (tga_image_type == 3), &tga_rgb16);
5915 if(!tga_comp) // shouldn't really happen, stbi__tga_test() should have ensured basic consistency
5916 return stbi__errpuc("bad format", "Can't find out TGA pixelformat");
5921 if (comp) *comp = tga_comp;
5923 if (!stbi__mad3sizes_valid(tga_width, tga_height, tga_comp, 0))
5924 return stbi__errpuc("too large", "Corrupt TGA");
5926 tga_data = (unsigned char*)stbi__malloc_mad3(tga_width, tga_height, tga_comp, 0);
5927 if (!tga_data) return stbi__errpuc("outofmem", "Out of memory");
5929 // skip to the data's starting position (offset usually = 0)
5930 stbi__skip(s, tga_offset );
5932 if ( !tga_indexed && !tga_is_RLE && !tga_rgb16 ) {
5933 for (i=0; i < tga_height; ++i) {
5934 int row = tga_inverted ? tga_height -i - 1 : i;
5935 stbi_uc *tga_row = tga_data + row*tga_width*tga_comp;
5936 stbi__getn(s, tga_row, tga_width * tga_comp);
5939 // do I need to load a palette?
5942 if (tga_palette_len == 0) { /* you have to have at least one entry! */
5943 STBI_FREE(tga_data);
5944 return stbi__errpuc("bad palette", "Corrupt TGA");
5947 // any data to skip? (offset usually = 0)
5948 stbi__skip(s, tga_palette_start );
5950 tga_palette = (unsigned char*)stbi__malloc_mad2(tga_palette_len, tga_comp, 0);
5952 STBI_FREE(tga_data);
5953 return stbi__errpuc("outofmem", "Out of memory");
5956 stbi_uc *pal_entry = tga_palette;
5957 STBI_ASSERT(tga_comp == STBI_rgb);
5958 for (i=0; i < tga_palette_len; ++i) {
5959 stbi__tga_read_rgb16(s, pal_entry);
5960 pal_entry += tga_comp;
5962 } else if (!stbi__getn(s, tga_palette, tga_palette_len * tga_comp)) {
5963 STBI_FREE(tga_data);
5964 STBI_FREE(tga_palette);
5965 return stbi__errpuc("bad palette", "Corrupt TGA");
5969 for (i=0; i < tga_width * tga_height; ++i)
5971 // if I'm in RLE mode, do I need to get an RLE chunk?
5974 if ( RLE_count == 0 )
5976 // yep, get the next byte as an RLE command
5977 int RLE_cmd = stbi__get8(s);
5978 RLE_count = 1 + (RLE_cmd & 127);
5979 RLE_repeating = RLE_cmd >> 7;
5980 read_next_pixel = 1;
5981 } else if ( !RLE_repeating )
5983 read_next_pixel = 1;
5987 read_next_pixel = 1;
5989 // OK, if I need to read a pixel, do it now
5990 if ( read_next_pixel )
5992 // load however much data we did have
5995 // read in index, then perform the lookup
5996 int pal_idx = (tga_bits_per_pixel == 8) ? stbi__get8(s) : stbi__get16le(s);
5997 if ( pal_idx >= tga_palette_len ) {
6001 pal_idx *= tga_comp;
6002 for (j = 0; j < tga_comp; ++j) {
6003 raw_data[j] = tga_palette[pal_idx+j];
6005 } else if(tga_rgb16) {
6006 STBI_ASSERT(tga_comp == STBI_rgb);
6007 stbi__tga_read_rgb16(s, raw_data);
6009 // read in the data raw
6010 for (j = 0; j < tga_comp; ++j) {
6011 raw_data[j] = stbi__get8(s);
6014 // clear the reading flag for the next pixel
6015 read_next_pixel = 0;
6016 } // end of reading a pixel
6019 for (j = 0; j < tga_comp; ++j)
6020 tga_data[i*tga_comp+j] = raw_data[j];
6022 // in case we're in RLE mode, keep counting down
6025 // do I need to invert the image?
6028 for (j = 0; j*2 < tga_height; ++j)
6030 int index1 = j * tga_width * tga_comp;
6031 int index2 = (tga_height - 1 - j) * tga_width * tga_comp;
6032 for (i = tga_width * tga_comp; i > 0; --i)
6034 unsigned char temp = tga_data[index1];
6035 tga_data[index1] = tga_data[index2];
6036 tga_data[index2] = temp;
6042 // clear my palette, if I had one
6043 if ( tga_palette != NULL )
6045 STBI_FREE( tga_palette );
6049 // swap RGB - if the source data was RGB16, it already is in the right order
6050 if (tga_comp >= 3 && !tga_rgb16)
6052 unsigned char* tga_pixel = tga_data;
6053 for (i=0; i < tga_width * tga_height; ++i)
6055 unsigned char temp = tga_pixel[0];
6056 tga_pixel[0] = tga_pixel[2];
6057 tga_pixel[2] = temp;
6058 tga_pixel += tga_comp;
6062 // convert to target component count
6063 if (req_comp && req_comp != tga_comp)
6064 tga_data = stbi__convert_format(tga_data, tga_comp, req_comp, tga_width, tga_height);
6066 // the things I do to get rid of an error message, and yet keep
6067 // Microsoft's C compilers happy... [8^(
6068 tga_palette_start = tga_palette_len = tga_palette_bits =
6069 tga_x_origin = tga_y_origin = 0;
6070 STBI_NOTUSED(tga_palette_start);
6076 // *************************************************************************************************
6077 // Photoshop PSD loader -- PD by Thatcher Ulrich, integration by Nicolas Schulz, tweaked by STB
6080 static int stbi__psd_test(stbi__context *s)
6082 int r = (stbi__get32be(s) == 0x38425053);
6087 static int stbi__psd_decode_rle(stbi__context *s, stbi_uc *p, int pixelCount)
6089 int count, nleft, len;
6092 while ((nleft = pixelCount - count) > 0) {
6093 len = stbi__get8(s);
6096 } else if (len < 128) {
6097 // Copy next len+1 bytes literally.
6099 if (len > nleft) return 0; // corrupt data
6106 } else if (len > 128) {
6108 // Next -len+1 bytes in the dest are replicated from next source byte.
6109 // (Interpret len as a negative 8-bit int.)
6111 if (len > nleft) return 0; // corrupt data
6112 val = stbi__get8(s);
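// Illustrative sketch (not part of stb_image): the RLE scheme decoded above
// is the PackBits style used by PSD and TIFF. The standalone decoder below
// works on memory buffers and writes output contiguously, whereas the loader
// above reads from a stream and writes every 4th byte. The name
// packbits_decode and its return convention are hypothetical.

```c
#include <assert.h>
#include <string.h>

// Decode PackBits data from src into exactly dst_len bytes of dst.
// Returns the number of source bytes consumed, or -1 on corrupt/short input.
static int packbits_decode(const unsigned char *src, int src_len,
                           unsigned char *dst, int dst_len)
{
    int si = 0, di = 0;
    assert(src != NULL && dst != NULL && src_len >= 0 && dst_len >= 0);
    while (di < dst_len) {
        int n, count;
        if (si >= src_len) return -1;          // ran out of input
        n = src[si++];
        if (n == 128) continue;                // no-op marker
        if (n < 128) {                         // literal run of n+1 bytes
            count = n + 1;
            if (count > dst_len - di || count > src_len - si) return -1;
            memcpy(dst + di, src + si, (size_t)count);
            si += count;
        } else {                               // repeat next byte 257-n times
            count = 257 - n;
            if (count > dst_len - di || si >= src_len) return -1;
            memset(dst + di, src[si++], (size_t)count);
        }
        di += count;
    }
    return si;
}
```

// For example, the bytes 02 'a' 'b' 'c' FE 'z' decode to "abczzz": a literal
// run of three bytes followed by one byte repeated three times.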
6125 static void *stbi__psd_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri, int bpc)
6128 int channelCount, compression;
6136 if (stbi__get32be(s) != 0x38425053) // "8BPS"
6137 return stbi__errpuc("not PSD", "Corrupt PSD image");
6139 // Check file type version.
6140 if (stbi__get16be(s) != 1)
6141 return stbi__errpuc("wrong version", "Unsupported version of PSD image");
6143 // Skip 6 reserved bytes.
6146 // Read the number of channels (R, G, B, A, etc).
6147 channelCount = stbi__get16be(s);
6148 if (channelCount < 0 || channelCount > 16)
6149 return stbi__errpuc("wrong channel count", "Unsupported number of channels in PSD image");
6151 // Read the rows and columns of the image.
6152 h = stbi__get32be(s);
6153 w = stbi__get32be(s);
6155 if (h > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
6156 if (w > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
6158 // Make sure the depth is 8 or 16 bits.
6159 bitdepth = stbi__get16be(s);
6160 if (bitdepth != 8 && bitdepth != 16)
6161 return stbi__errpuc("unsupported bit depth", "PSD bit depth is not 8 or 16 bit");
6163 // Make sure the color mode is RGB.
6164 // Valid options are:
6173 if (stbi__get16be(s) != 3)
6174 return stbi__errpuc("wrong color format", "PSD is not in RGB color format");
6176 // Skip the Mode Data. (It's the palette for indexed color; other info for other modes.)
6177 stbi__skip(s,stbi__get32be(s) );
6179 // Skip the image resources. (resolution, pen tool paths, etc)
6180 stbi__skip(s, stbi__get32be(s) );
6182 // Skip the reserved data.
6183 stbi__skip(s, stbi__get32be(s) );
6185 // Find out if the data is compressed.
6187 // 0: no compression
6188 // 1: RLE compressed
6189 compression = stbi__get16be(s);
6190 if (compression > 1)
6191 return stbi__errpuc("bad compression", "PSD has an unknown compression format");
6194 if (!stbi__mad3sizes_valid(4, w, h, 0))
6195 return stbi__errpuc("too large", "Corrupt PSD");
6197 // Create the destination image.
6199 if (!compression && bitdepth == 16 && bpc == 16) {
6200 out = (stbi_uc *) stbi__malloc_mad3(8, w, h, 0);
6201 ri->bits_per_channel = 16;
6203 out = (stbi_uc *) stbi__malloc(4 * w*h);
6205 if (!out) return stbi__errpuc("outofmem", "Out of memory");
6208 // Initialize the data to zero.
6209 //memset( out, 0, pixelCount * 4 );
6211 // Finally, the image data.
6213 // RLE as used by .PSD and .TIFF
6214 // Loop until you get the number of unpacked bytes you are expecting:
6215 // Read the next source byte into n.
6216 // If n is between 0 and 127 inclusive, copy the next n+1 bytes literally.
6217 // Else if n is between -127 and -1 inclusive, copy the next byte -n+1 times.
6218 // Else if n is -128 (byte value 128), noop.
6221 // The RLE-compressed data is preceded by a 2-byte data count for each row in the data,
6222 // which we're going to just skip.
6223 stbi__skip(s, h * channelCount * 2 );
6225 // Read the RLE data by channel.
6226 for (channel = 0; channel < 4; channel++) {
6230 if (channel >= channelCount) {
6231 // Fill this channel with default data.
6232 for (i = 0; i < pixelCount; i++, p += 4)
6233 *p = (channel == 3 ? 255 : 0);
6235 // Read the RLE data.
6236 if (!stbi__psd_decode_rle(s, p, pixelCount)) {
6238 return stbi__errpuc("corrupt", "bad RLE data");
6244 // We're at the raw image data. It's each channel in order (Red, Green, Blue, Alpha, ...)
6245 // where each channel consists of an 8-bit (or 16-bit) value for each pixel in the image.
6247 // Read the data by channel.
6248 for (channel = 0; channel < 4; channel++) {
6249 if (channel >= channelCount) {
6250 // Fill this channel with default data.
6251 if (bitdepth == 16 && bpc == 16) {
6252 stbi__uint16 *q = ((stbi__uint16 *) out) + channel;
6253 stbi__uint16 val = channel == 3 ? 65535 : 0;
6254 for (i = 0; i < pixelCount; i++, q += 4)
6257 stbi_uc *p = out+channel;
6258 stbi_uc val = channel == 3 ? 255 : 0;
6259 for (i = 0; i < pixelCount; i++, p += 4)
6263 if (ri->bits_per_channel == 16) { // output bpc
6264 stbi__uint16 *q = ((stbi__uint16 *) out) + channel;
6265 for (i = 0; i < pixelCount; i++, q += 4)
6266 *q = (stbi__uint16) stbi__get16be(s);
6268 stbi_uc *p = out+channel;
6269 if (bitdepth == 16) { // input bpc
6270 for (i = 0; i < pixelCount; i++, p += 4)
6271 *p = (stbi_uc) (stbi__get16be(s) >> 8);
6273 for (i = 0; i < pixelCount; i++, p += 4)
6281 // remove weird white matte from PSD
6282 if (channelCount >= 4) {
6283 if (ri->bits_per_channel == 16) {
6284 for (i=0; i < w*h; ++i) {
6285 stbi__uint16 *pixel = (stbi__uint16 *) out + 4*i;
6286 if (pixel[3] != 0 && pixel[3] != 65535) {
6287 float a = pixel[3] / 65535.0f;
6288 float ra = 1.0f / a;
6289 float inv_a = 65535.0f * (1 - ra);
6290 pixel[0] = (stbi__uint16) (pixel[0]*ra + inv_a);
6291 pixel[1] = (stbi__uint16) (pixel[1]*ra + inv_a);
6292 pixel[2] = (stbi__uint16) (pixel[2]*ra + inv_a);
6296 for (i=0; i < w*h; ++i) {
6297 unsigned char *pixel = out + 4*i;
6298 if (pixel[3] != 0 && pixel[3] != 255) {
6299 float a = pixel[3] / 255.0f;
6300 float ra = 1.0f / a;
6301 float inv_a = 255.0f * (1 - ra);
6302 pixel[0] = (unsigned char) (pixel[0]*ra + inv_a);
6303 pixel[1] = (unsigned char) (pixel[1]*ra + inv_a);
6304 pixel[2] = (unsigned char) (pixel[2]*ra + inv_a);
6310 // convert to desired output format
6311 if (req_comp && req_comp != 4) {
6312 if (ri->bits_per_channel == 16)
6313 out = (stbi_uc *) stbi__convert_format16((stbi__uint16 *) out, 4, req_comp, w, h);
6315 out = stbi__convert_format(out, 4, req_comp, w, h);
6316 if (out == NULL) return out; // stbi__convert_format frees input on failure
6319 if (comp) *comp = 4;
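// Illustrative sketch, not part of stb_image: the PackBits-style RLE rule
// described in the PSD loader comments above, written as a standalone decoder.
// packbits_decode is a hypothetical name. Unlike stbi__psd_decode_rle it
// writes bytes contiguously instead of with a stride of 4, which is an
// assumption made only to keep the example short.

```c
#include <stddef.h>

// Hypothetical helper: decodes PackBits-style RLE from src into dst until
// dst_len bytes have been produced. Returns the number of bytes written,
// or -1 if the input is truncated or a run would overflow dst.
static long packbits_decode(const unsigned char *src, size_t src_len,
                            unsigned char *dst, size_t dst_len)
{
    size_t in = 0, out = 0;
    while (out < dst_len) {
        if (in >= src_len) return -1;          // ran out of input
        unsigned char n = src[in++];
        if (n == 128) {
            continue;                          // -128 as a signed byte: no-op
        } else if (n < 128) {
            size_t count = (size_t) n + 1;     // literal run of n+1 bytes
            if (count > dst_len - out || count > src_len - in) return -1;
            for (size_t i = 0; i < count; ++i) dst[out++] = src[in++];
        } else {
            size_t count = 257 - (size_t) n;   // -n+1 with n read as signed
            if (count > dst_len - out || in >= src_len) return -1;
            unsigned char value = src[in++];
            for (size_t i = 0; i < count; ++i) dst[out++] = value;
        }
    }
    return (long) out;
}
```

// Decoding { 2,'a','b','c', 0xFE,'x' } into a 6-byte buffer yields "abcxxx":
// a literal run of 3 bytes followed by 'x' repeated 257-254 = 3 times.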
6327 // *************************************************************************************************
6328 // Softimage PIC loader
6331 // See http://softimage.wiki.softimage.com/index.php/INFO:_PIC_file_format
6332 // See http://ozviz.wasp.uwa.edu.au/~pbourke/dataformats/softimagepic/
6335 static int stbi__pic_is4(stbi__context *s,const char *str)
6339 if (stbi__get8(s) != (stbi_uc)str[i])
6345 static int stbi__pic_test_core(stbi__context *s)
6349 if (!stbi__pic_is4(s,"\x53\x80\xF6\x34"))
6355 if (!stbi__pic_is4(s,"PICT"))
6363 stbi_uc size,type,channel;
6366 static stbi_uc *stbi__readval(stbi__context *s, int channel, stbi_uc *dest)
6370 for (i=0; i<4; ++i, mask>>=1) {
6371 if (channel & mask) {
6372 if (stbi__at_eof(s)) return stbi__errpuc("bad file","PIC file too short");
6373 dest[i]=stbi__get8(s);
6380 static void stbi__copyval(int channel,stbi_uc *dest,const stbi_uc *src)
6384 for (i=0;i<4; ++i, mask>>=1)
6389 static stbi_uc *stbi__pic_load_core(stbi__context *s,int width,int height,int *comp, stbi_uc *result)
6391 int act_comp=0,num_packets=0,y,chained;
6392 stbi__pic_packet packets[10];
6394 // this will (should...) cater for even some bizarre stuff like having data
6395 // for the same channel in multiple packets.
6397 stbi__pic_packet *packet;
6399 if (num_packets==sizeof(packets)/sizeof(packets[0]))
6400 return stbi__errpuc("bad format","too many packets");
6402 packet = &packets[num_packets++];
6404 chained = stbi__get8(s);
6405 packet->size = stbi__get8(s);
6406 packet->type = stbi__get8(s);
6407 packet->channel = stbi__get8(s);
6409 act_comp |= packet->channel;
6411 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (reading packets)");
6412 if (packet->size != 8) return stbi__errpuc("bad format","packet isn't 8bpp");
6415 *comp = (act_comp & 0x10 ? 4 : 3); // has alpha channel?
6417 for(y=0; y<height; ++y) {
6420 for(packet_idx=0; packet_idx < num_packets; ++packet_idx) {
6421 stbi__pic_packet *packet = &packets[packet_idx];
6422 stbi_uc *dest = result+y*width*4;
6424 switch (packet->type) {
6426 return stbi__errpuc("bad format","packet has bad compression type");
6428 case 0: {//uncompressed
6431 for(x=0;x<width;++x, dest+=4)
6432 if (!stbi__readval(s,packet->channel,dest))
6442 stbi_uc count,value[4];
6444 count=stbi__get8(s);
6445 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pure read count)");
6448 count = (stbi_uc) left;
6450 if (!stbi__readval(s,packet->channel,value)) return 0;
6452 for(i=0; i<count; ++i,dest+=4)
6453 stbi__copyval(packet->channel,dest,value);
6459 case 2: {//Mixed RLE
6462 int count = stbi__get8(s), i;
6463 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (mixed read count)");
6465 if (count >= 128) { // Repeated
6469 count = stbi__get16be(s);
6473 return stbi__errpuc("bad file","scanline overrun");
6475 if (!stbi__readval(s,packet->channel,value))
6478 for(i=0;i<count;++i, dest += 4)
6479 stbi__copyval(packet->channel,dest,value);
6482 if (count>left) return stbi__errpuc("bad file","scanline overrun");
6484 for(i=0;i<count;++i, dest+=4)
6485 if (!stbi__readval(s,packet->channel,dest))
6499 static void *stbi__pic_load(stbi__context *s,int *px,int *py,int *comp,int req_comp, stbi__result_info *ri)
6502 int i, x,y, internal_comp;
6505 if (!comp) comp = &internal_comp;
6507 for (i=0; i<92; ++i)
6510 x = stbi__get16be(s);
6511 y = stbi__get16be(s);
6513 if (y > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
6514 if (x > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
6516 if (stbi__at_eof(s)) return stbi__errpuc("bad file","file too short (pic header)");
6517 if (!stbi__mad3sizes_valid(x, y, 4, 0)) return stbi__errpuc("too large", "PIC image too large to decode");
6519 stbi__get32be(s); //skip `ratio'
6520 stbi__get16be(s); //skip `fields'
6521 stbi__get16be(s); //skip `pad'
6523 // intermediate buffer is RGBA
6524 result = (stbi_uc *) stbi__malloc_mad3(x, y, 4, 0);
6525 if (!result) return stbi__errpuc("outofmem", "Out of memory");
6526 memset(result, 0xff, x*y*4);
6528 if (!stbi__pic_load_core(s,x,y,comp, result)) {
6534 if (req_comp == 0) req_comp = *comp;
6535 result=stbi__convert_format(result,4,req_comp,x,y);
6540 static int stbi__pic_test(stbi__context *s)
6542 int r = stbi__pic_test_core(s);
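// Illustrative sketch, not part of stb_image: how a PIC packet's channel byte
// selects components. The loaders above walk a mask that shifts right once
// per component and test 0x10 for alpha, so this sketch assumes the mask
// starts at 0x80 (R=0x80, G=0x40, B=0x20, A=0x10). pic_copy_channels and
// pic_components are hypothetical names.

```c
// Hypothetical helper: copies only the components selected by the channel
// mask from src to dest. Both point at one RGBA pixel (4 bytes).
static void pic_copy_channels(int channel, unsigned char *dest,
                              const unsigned char *src)
{
    int mask = 0x80;
    for (int i = 0; i < 4; ++i, mask >>= 1)
        if (channel & mask)
            dest[i] = src[i];
}

// Hypothetical helper: component count implied by the OR of all packet
// channel masks. Alpha present means 4, otherwise 3.
static int pic_components(int act_comp)
{
    return (act_comp & 0x10) ? 4 : 3;
}
```

// A packet with channel 0xE0 therefore touches R, G and B and leaves alpha
// alone, which is why one file can carry colour and alpha in separate packets.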
6548 // *************************************************************************************************
6549 // GIF loader -- public domain by Jean-Marc Lienher -- simplified/shrunk by stb
6562 stbi_uc *out; // output buffer (always 4 components)
6563 stbi_uc *background; // The current "background" as far as a gif is concerned
6565 int flags, bgindex, ratio, transparent, eflags;
6566 stbi_uc pal[256][4];
6567 stbi_uc lpal[256][4];
6568 stbi__gif_lzw codes[8192];
6569 stbi_uc *color_table;
6572 int start_x, start_y;
6579 static int stbi__gif_test_raw(stbi__context *s)
6582 if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8') return 0;
6584 if (sz != '9' && sz != '7') return 0;
6585 if (stbi__get8(s) != 'a') return 0;
6589 static int stbi__gif_test(stbi__context *s)
6591 int r = stbi__gif_test_raw(s);
6596 static void stbi__gif_parse_colortable(stbi__context *s, stbi_uc pal[256][4], int num_entries, int transp)
6599 for (i=0; i < num_entries; ++i) {
6600 pal[i][2] = stbi__get8(s);
6601 pal[i][1] = stbi__get8(s);
6602 pal[i][0] = stbi__get8(s);
6603 pal[i][3] = transp == i ? 0 : 255;
6607 static int stbi__gif_header(stbi__context *s, stbi__gif *g, int *comp, int is_info)
6610 if (stbi__get8(s) != 'G' || stbi__get8(s) != 'I' || stbi__get8(s) != 'F' || stbi__get8(s) != '8')
6611 return stbi__err("not GIF", "Corrupt GIF");
6613 version = stbi__get8(s);
6614 if (version != '7' && version != '9') return stbi__err("not GIF", "Corrupt GIF");
6615 if (stbi__get8(s) != 'a') return stbi__err("not GIF", "Corrupt GIF");
6617 stbi__g_failure_reason = "";
6618 g->w = stbi__get16le(s);
6619 g->h = stbi__get16le(s);
6620 g->flags = stbi__get8(s);
6621 g->bgindex = stbi__get8(s);
6622 g->ratio = stbi__get8(s);
6623 g->transparent = -1;
6625 if (g->w > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
6626 if (g->h > STBI_MAX_DIMENSIONS) return stbi__err("too large","Very large image (corrupt?)");
6628 if (comp != 0) *comp = 4; // can't actually tell whether it's 3 or 4 until we parse the comments
6630 if (is_info) return 1;
6632 if (g->flags & 0x80)
6633 stbi__gif_parse_colortable(s,g->pal, 2 << (g->flags & 7), -1);
6638 static int stbi__gif_info_raw(stbi__context *s, int *x, int *y, int *comp)
6640 stbi__gif* g = (stbi__gif*) stbi__malloc(sizeof(stbi__gif));
6641 if (!g) return stbi__err("outofmem", "Out of memory");
6642 if (!stbi__gif_header(s, g, comp, 1)) {
6653 static void stbi__out_gif_code(stbi__gif *g, stbi__uint16 code)
6658 // recurse to decode the prefixes, since the linked-list is backwards,
6659 // and working backwards through an interleaved image would be nasty
6660 if (g->codes[code].prefix >= 0)
6661 stbi__out_gif_code(g, g->codes[code].prefix);
6663 if (g->cur_y >= g->max_y) return;
6665 idx = g->cur_x + g->cur_y;
6667 g->history[idx / 4] = 1;
6669 c = &g->color_table[g->codes[code].suffix * 4];
6670 if (c[3] > 128) { // don't render transparent pixels;
6678 if (g->cur_x >= g->max_x) {
6679 g->cur_x = g->start_x;
6680 g->cur_y += g->step;
6682 while (g->cur_y >= g->max_y && g->parse > 0) {
6683 g->step = (1 << g->parse) * g->line_size;
6684 g->cur_y = g->start_y + (g->step >> 1);
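// Illustrative sketch, not part of stb_image: the row order produced by the
// interlace bookkeeping above. It assumes the first pass starts with a step
// of 8 rows and a pass counter of 3, and it counts rows rather than byte
// offsets. gif_interlace_order is a hypothetical name.

```c
// Hypothetical helper: fills order[0..height-1] with the image row that each
// successive stored row of an interlaced GIF belongs to.
static void gif_interlace_order(int height, int *order)
{
    int step = 8, parse = 3, y = 0, n = 0;
    while (y < height) {
        order[n++] = y;
        y += step;
        // Ran off the bottom: start the next pass, halfway into the new step.
        while (y >= height && parse > 0) {
            step = 1 << parse;
            y = step >> 1;
            --parse;
        }
    }
}
```

// For an 8-row image this gives 0, 4, 2, 6, 1, 3, 5, 7.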
6690 static stbi_uc *stbi__process_gif_raster(stbi__context *s, stbi__gif *g)
6693 stbi__int32 len, init_code;
6695 stbi__int32 codesize, codemask, avail, oldcode, bits, valid_bits, clear;
6698 lzw_cs = stbi__get8(s);
6699 if (lzw_cs > 12) return NULL;
6700 clear = 1 << lzw_cs;
6702 codesize = lzw_cs + 1;
6703 codemask = (1 << codesize) - 1;
6706 for (init_code = 0; init_code < clear; init_code++) {
6707 g->codes[init_code].prefix = -1;
6708 g->codes[init_code].first = (stbi_uc) init_code;
6709 g->codes[init_code].suffix = (stbi_uc) init_code;
6712 // support no starting clear code
6718 if (valid_bits < codesize) {
6720 len = stbi__get8(s); // start new block
6725 bits |= (stbi__int32) stbi__get8(s) << valid_bits;
6728 stbi__int32 code = bits & codemask;
6730 valid_bits -= codesize;
6731 // @OPTIMIZE: is there some way we can accelerate the non-clear path?
6732 if (code == clear) { // clear code
6733 codesize = lzw_cs + 1;
6734 codemask = (1 << codesize) - 1;
6738 } else if (code == clear + 1) { // end of stream code
6740 while ((len = stbi__get8(s)) > 0)
6743 } else if (code <= avail) {
6745 return stbi__errpuc("no clear code", "Corrupt GIF");
6749 p = &g->codes[avail++];
6751 return stbi__errpuc("too many codes", "Corrupt GIF");
6754 p->prefix = (stbi__int16) oldcode;
6755 p->first = g->codes[oldcode].first;
6756 p->suffix = (code == avail) ? p->first : g->codes[code].first;
6757 } else if (code == avail)
6758 return stbi__errpuc("illegal code in raster", "Corrupt GIF");
6760 stbi__out_gif_code(g, (stbi__uint16) code);
6762 if ((avail & codemask) == 0 && avail <= 0x0FFF) {
6764 codemask = (1 << codesize) - 1;
6769 return stbi__errpuc("illegal code in raster", "Corrupt GIF");
6775 // this function is designed to support animated gifs, although stb_image doesn't support it
6776 // two back is the image from two frames ago, used for a very specific disposal format
6777 static stbi_uc *stbi__gif_load_next(stbi__context *s, stbi__gif *g, int *comp, int req_comp, stbi_uc *two_back)
6783 STBI_NOTUSED(req_comp);
6785 // on first frame, any non-written pixels get the background colour (non-transparent)
6788 if (!stbi__gif_header(s, g, comp,0)) return 0; // stbi__g_failure_reason set by stbi__gif_header
6789 if (!stbi__mad3sizes_valid(4, g->w, g->h, 0))
6790 return stbi__errpuc("too large", "GIF image is too large");
6791 pcount = g->w * g->h;
6792 g->out = (stbi_uc *) stbi__malloc(4 * pcount);
6793 g->background = (stbi_uc *) stbi__malloc(4 * pcount);
6794 g->history = (stbi_uc *) stbi__malloc(pcount);
6795 if (!g->out || !g->background || !g->history)
6796 return stbi__errpuc("outofmem", "Out of memory");
6798 // image is treated as "transparent" at the start - ie, nothing overwrites the current background;
6799 // background colour is only used for pixels that are not rendered first frame, after that "background"
6800 // color refers to the color that was there the previous frame.
6801 memset(g->out, 0x00, 4 * pcount);
6802 memset(g->background, 0x00, 4 * pcount); // state of the background (starts transparent)
6803 memset(g->history, 0x00, pcount); // pixels that were affected previous frame
6806 // second frame - how do we dispose of the previous one?
6807 dispose = (g->eflags & 0x1C) >> 2;
6808 pcount = g->w * g->h;
6810 if ((dispose == 3) && (two_back == 0)) {
6811 dispose = 2; // if I don't have an image to revert back to, default to the old background
6814 if (dispose == 3) { // use previous graphic
6815 for (pi = 0; pi < pcount; ++pi) {
6816 if (g->history[pi]) {
6817 memcpy( &g->out[pi * 4], &two_back[pi * 4], 4 );
6820 } else if (dispose == 2) {
6821 // restore what was changed last frame to background before that frame;
6822 for (pi = 0; pi < pcount; ++pi) {
6823 if (g->history[pi]) {
6824 memcpy( &g->out[pi * 4], &g->background[pi * 4], 4 );
6828 // This is a non-disposal case either way, so just
6829 // leave the pixels as is, and they will become the new background
6830 // 1: do not dispose
6831 // 0: not specified.
6834 // background is what out is after the undoing of the previous frame;
6835 memcpy( g->background, g->out, 4 * g->w * g->h );
6838 // clear my history;
6839 memset( g->history, 0x00, g->w * g->h ); // pixels that were affected previous frame
6842 int tag = stbi__get8(s);
6844 case 0x2C: /* Image Descriptor */
6846 stbi__int32 x, y, w, h;
6849 x = stbi__get16le(s);
6850 y = stbi__get16le(s);
6851 w = stbi__get16le(s);
6852 h = stbi__get16le(s);
6853 if (((x + w) > (g->w)) || ((y + h) > (g->h)))
6854 return stbi__errpuc("bad Image Descriptor", "Corrupt GIF");
6856 g->line_size = g->w * 4;
6858 g->start_y = y * g->line_size;
6859 g->max_x = g->start_x + w * 4;
6860 g->max_y = g->start_y + h * g->line_size;
6861 g->cur_x = g->start_x;
6862 g->cur_y = g->start_y;
6864 // if the width of the specified rectangle is 0, that means
6865 // we may not see *any* pixels or the image is malformed;
6866 // to make sure this is caught, move the current y down to
6867 // max_y (which is what out_gif_code checks).
6869 g->cur_y = g->max_y;
6871 g->lflags = stbi__get8(s);
6873 if (g->lflags & 0x40) {
6874 g->step = 8 * g->line_size; // first interlaced spacing
6877 g->step = g->line_size;
6881 if (g->lflags & 0x80) {
6882 stbi__gif_parse_colortable(s,g->lpal, 2 << (g->lflags & 7), g->eflags & 0x01 ? g->transparent : -1);
6883 g->color_table = (stbi_uc *) g->lpal;
6884 } else if (g->flags & 0x80) {
6885 g->color_table = (stbi_uc *) g->pal;
6887 return stbi__errpuc("missing color table", "Corrupt GIF");
6889 o = stbi__process_gif_raster(s, g);
6890 if (!o) return NULL;
6892 // if this was the first frame,
6893 pcount = g->w * g->h;
6894 if (first_frame && (g->bgindex > 0)) {
6895 // if first frame, any pixel not drawn to gets the background color
6896 for (pi = 0; pi < pcount; ++pi) {
6897 if (g->history[pi] == 0) {
6898 g->pal[g->bgindex][3] = 255; // just in case it was made transparent, undo that; It will be reset next frame if need be;
6899 memcpy( &g->out[pi * 4], &g->pal[g->bgindex], 4 );
6907 case 0x21: // Extension introducer (Graphic Control, Comment, etc.).
6910 int ext = stbi__get8(s);
6911 if (ext == 0xF9) { // Graphic Control Extension.
6912 len = stbi__get8(s);
6914 g->eflags = stbi__get8(s);
6915 g->delay = 10 * stbi__get16le(s); // delay - 1/100th of a second, saving as 1/1000ths.
6917 // unset old transparent
6918 if (g->transparent >= 0) {
6919 g->pal[g->transparent][3] = 255;
6921 if (g->eflags & 0x01) {
6922 g->transparent = stbi__get8(s);
6923 if (g->transparent >= 0) {
6924 g->pal[g->transparent][3] = 0;
6927 // don't need transparent
6929 g->transparent = -1;
6936 while ((len = stbi__get8(s)) != 0) {
6942 case 0x3B: // gif stream termination code
6943 return (stbi_uc *) s; // using '1' causes warning on some compilers
6946 return stbi__errpuc("unknown code", "Corrupt GIF");
6951 static void *stbi__load_gif_main_outofmem(stbi__gif *g, stbi_uc *out, int **delays)
6954 STBI_FREE(g->history);
6955 STBI_FREE(g->background);
6957 if (out) STBI_FREE(out);
6958 if (delays && *delays) STBI_FREE(*delays);
6959 return stbi__errpuc("outofmem", "Out of memory");
6962 static void *stbi__load_gif_main(stbi__context *s, int **delays, int *x, int *y, int *z, int *comp, int req_comp)
6964 if (stbi__gif_test(s)) {
6968 stbi_uc *two_back = 0;
6972 int delays_size = 0;
6974 STBI_NOTUSED(out_size);
6975 STBI_NOTUSED(delays_size);
6977 memset(&g, 0, sizeof(g));
6983 u = stbi__gif_load_next(s, &g, comp, req_comp, two_back);
6984 if (u == (stbi_uc *) s) u = 0; // end of animated gif marker
6990 stride = g.w * g.h * 4;
6993 void *tmp = (stbi_uc*) STBI_REALLOC_SIZED( out, out_size, layers * stride );
6995 return stbi__load_gif_main_outofmem(&g, out, delays);
6997 out = (stbi_uc*) tmp;
6998 out_size = layers * stride;
7002 int *new_delays = (int*) STBI_REALLOC_SIZED( *delays, delays_size, sizeof(int) * layers );
7004 return stbi__load_gif_main_outofmem(&g, out, delays);
7005 *delays = new_delays;
7006 delays_size = layers * sizeof(int);
7009 out = (stbi_uc*)stbi__malloc( layers * stride );
7011 return stbi__load_gif_main_outofmem(&g, out, delays);
7012 out_size = layers * stride;
7014 *delays = (int*) stbi__malloc( layers * sizeof(int) );
7016 return stbi__load_gif_main_outofmem(&g, out, delays);
7017 delays_size = layers * sizeof(int);
7020 memcpy( out + ((layers - 1) * stride), u, stride );
7022 two_back = out - 2 * stride;
7026 (*delays)[layers - 1U] = g.delay;
7031 // free temp buffer;
7033 STBI_FREE(g.history);
7034 STBI_FREE(g.background);
7036 // do the final conversion after loading everything;
7037 if (req_comp && req_comp != 4)
7038 out = stbi__convert_format(out, 4, req_comp, layers * g.w, g.h);
7043 return stbi__errpuc("not GIF", "Image was not a GIF type.");
7047 static void *stbi__gif_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
7051 memset(&g, 0, sizeof(g));
7054 u = stbi__gif_load_next(s, &g, comp, req_comp, 0);
7055 if (u == (stbi_uc *) s) u = 0; // end of animated gif marker
7060 // moved conversion to after successful load so that the same
7061 // can be done for multiple frames.
7062 if (req_comp && req_comp != 4)
7063 u = stbi__convert_format(u, 4, req_comp, g.w, g.h);
7065 // if there was an error and we allocated an image buffer, free it!
7069 // free buffers needed for multiple frame loading;
7070 STBI_FREE(g.history);
7071 STBI_FREE(g.background);
7076 static int stbi__gif_info(stbi__context *s, int *x, int *y, int *comp)
7078 return stbi__gif_info_raw(s,x,y,comp);
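// Illustrative sketch, not part of stb_image: the bit fields the GIF loader
// above pulls out of its flag bytes, collected in one place. The function
// names are hypothetical; the masks and shifts are the ones used in the code.

```c
// Screen or image descriptor flags: bit 7 says a color table follows,
// and the low 3 bits give its size as 2^(n+1) entries.
static int gif_has_color_table(int flags)     { return (flags & 0x80) != 0; }
static int gif_color_table_entries(int flags) { return 2 << (flags & 7); }

// Image descriptor flags: bit 6 marks an interlaced image.
static int gif_is_interlaced(int lflags)      { return (lflags & 0x40) != 0; }

// Graphic Control Extension flags: bits 2..4 hold the disposal method,
// bit 0 says the transparent index is valid.
static int gif_disposal_method(int eflags)    { return (eflags & 0x1C) >> 2; }
static int gif_has_transparency(int eflags)   { return (eflags & 0x01) != 0; }
```

// Example: eflags 0x09 means disposal method 2 (restore to background) with
// a transparent index present; flags 0xF7 means a 256-entry color table.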
7082 // *************************************************************************************************
7083 // Radiance RGBE HDR loader
7084 // originally by Nicolas Schulz
7086 static int stbi__hdr_test_core(stbi__context *s, const char *signature)
7089 for (i=0; signature[i]; ++i)
7090 if (stbi__get8(s) != signature[i])
7096 static int stbi__hdr_test(stbi__context* s)
7098 int r = stbi__hdr_test_core(s, "#?RADIANCE\n");
7101 r = stbi__hdr_test_core(s, "#?RGBE\n");
7107 #define STBI__HDR_BUFLEN 1024
7108 static char *stbi__hdr_gettoken(stbi__context *z, char *buffer)
7113 c = (char) stbi__get8(z);
7115 while (!stbi__at_eof(z) && c != '\n') {
7117 if (len == STBI__HDR_BUFLEN-1) {
7118 // flush to end of line
7119 while (!stbi__at_eof(z) && stbi__get8(z) != '\n')
7123 c = (char) stbi__get8(z);
7130 static void stbi__hdr_convert(float *output, stbi_uc *input, int req_comp)
7132 if ( input[3] != 0 ) {
7135 f1 = (float) ldexp(1.0f, input[3] - (int)(128 + 8));
7137 output[0] = (input[0] + input[1] + input[2]) * f1 / 3;
7139 output[0] = input[0] * f1;
7140 output[1] = input[1] * f1;
7141 output[2] = input[2] * f1;
7143 if (req_comp == 2) output[1] = 1;
7144 if (req_comp == 4) output[3] = 1;
7147 case 4: output[3] = 1; /* fallthrough */
7148 case 3: output[0] = output[1] = output[2] = 0;
7150 case 2: output[1] = 1; /* fallthrough */
7151 case 1: output[0] = 0;
7157 static float *stbi__hdr_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
7159 char buffer[STBI__HDR_BUFLEN];
7166 unsigned char count, value;
7167 int i, j, k, c1,c2, z;
7168 const char *headerToken;
7172 headerToken = stbi__hdr_gettoken(s,buffer);
7173 if (strcmp(headerToken, "#?RADIANCE") != 0 && strcmp(headerToken, "#?RGBE") != 0)
7174 return stbi__errpf("not HDR", "Corrupt HDR image");
7178 token = stbi__hdr_gettoken(s,buffer);
7179 if (token[0] == 0) break;
7180 if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
7183 if (!valid) return stbi__errpf("unsupported format", "Unsupported HDR format");
7185 // Parse width and height
7186 // can't use sscanf() if we're not using stdio!
7187 token = stbi__hdr_gettoken(s,buffer);
7188 if (strncmp(token, "-Y ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
7190 height = (int) strtol(token, &token, 10);
7191 while (*token == ' ') ++token;
7192 if (strncmp(token, "+X ", 3)) return stbi__errpf("unsupported data layout", "Unsupported HDR format");
7194 width = (int) strtol(token, NULL, 10);
7196 if (height > STBI_MAX_DIMENSIONS) return stbi__errpf("too large","Very large image (corrupt?)");
7197 if (width > STBI_MAX_DIMENSIONS) return stbi__errpf("too large","Very large image (corrupt?)");
7202 if (comp) *comp = 3;
7203 if (req_comp == 0) req_comp = 3;
7205 if (!stbi__mad4sizes_valid(width, height, req_comp, sizeof(float), 0))
7206 return stbi__errpf("too large", "HDR image is too large");
7209 hdr_data = (float *) stbi__malloc_mad4(width, height, req_comp, sizeof(float), 0);
7211 return stbi__errpf("outofmem", "Out of memory");
7214 // image data is stored as some number of scanlines
7215 if ( width < 8 || width >= 32768) {
7217 for (j=0; j < height; ++j) {
7218 for (i=0; i < width; ++i) {
7221 stbi__getn(s, rgbe, 4);
7222 stbi__hdr_convert(hdr_data + j * width * req_comp + i * req_comp, rgbe, req_comp);
7226 // Read RLE-encoded data
7229 for (j = 0; j < height; ++j) {
7232 len = stbi__get8(s);
7233 if (c1 != 2 || c2 != 2 || (len & 0x80)) {
7234 // not run-length encoded, so we have to actually use THIS data as a decoded
7235 // pixel (note this can't be a valid pixel--one of RGB must be >= 128)
7237 rgbe[0] = (stbi_uc) c1;
7238 rgbe[1] = (stbi_uc) c2;
7239 rgbe[2] = (stbi_uc) len;
7240 rgbe[3] = (stbi_uc) stbi__get8(s);
7241 stbi__hdr_convert(hdr_data, rgbe, req_comp);
7244 STBI_FREE(scanline);
7245 goto main_decode_loop; // yes, this makes no sense
7248 len |= stbi__get8(s);
7249 if (len != width) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("invalid decoded scanline length", "corrupt HDR"); }
7250 if (scanline == NULL) {
7251 scanline = (stbi_uc *) stbi__malloc_mad2(width, 4, 0);
7253 STBI_FREE(hdr_data);
7254 return stbi__errpf("outofmem", "Out of memory");
7258 for (k = 0; k < 4; ++k) {
7261 while ((nleft = width - i) > 0) {
7262 count = stbi__get8(s);
7265 value = stbi__get8(s);
7267 if ((count == 0) || (count > nleft)) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
7268 for (z = 0; z < count; ++z)
7269 scanline[i++ * 4 + k] = value;
7272 if ((count == 0) || (count > nleft)) { STBI_FREE(hdr_data); STBI_FREE(scanline); return stbi__errpf("corrupt", "bad RLE data in HDR"); }
7273 for (z = 0; z < count; ++z)
7274 scanline[i++ * 4 + k] = stbi__get8(s);
7278 for (i=0; i < width; ++i)
7279 stbi__hdr_convert(hdr_data+(j*width + i)*req_comp, scanline + i*4, req_comp);
7282 STBI_FREE(scanline);
7288 static int stbi__hdr_info(stbi__context *s, int *x, int *y, int *comp)
7290 char buffer[STBI__HDR_BUFLEN];
7297 if (!comp) comp = &dummy;
7299 if (stbi__hdr_test(s) == 0) {
7305 token = stbi__hdr_gettoken(s,buffer);
7306 if (token[0] == 0) break;
7307 if (strcmp(token, "FORMAT=32-bit_rle_rgbe") == 0) valid = 1;
7314 token = stbi__hdr_gettoken(s,buffer);
7315 if (strncmp(token, "-Y ", 3)) {
7320 *y = (int) strtol(token, &token, 10);
7321 while (*token == ' ') ++token;
7322 if (strncmp(token, "+X ", 3)) {
7327 *x = (int) strtol(token, NULL, 10);
7331 #endif // STBI_NO_HDR
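// Illustrative sketch, not part of stb_image: the RGBE-to-float step that
// stbi__hdr_convert performs, reduced to the 3-component case. rgbe_to_rgb
// is a hypothetical name; the grey-scale averaging and alpha handling done
// by the real function are left out here.

```c
#include <math.h>

// Hypothetical helper: expands one RGBE pixel into three linear floats.
static void rgbe_to_rgb(const unsigned char rgbe[4], float rgb[3])
{
    if (rgbe[3] != 0) {
        // 128 is the exponent bias; the extra 8 scales the 8-bit mantissas by 1/256.
        float scale = (float) ldexp(1.0, rgbe[3] - (128 + 8));
        rgb[0] = rgbe[0] * scale;
        rgb[1] = rgbe[1] * scale;
        rgb[2] = rgbe[2] * scale;
    } else {
        // A zero exponent byte encodes black.
        rgb[0] = rgb[1] = rgb[2] = 0.0f;
    }
}
```

// Example: { 128, 64, 0, 129 } has scale 2^(129-136) = 1/128, so it decodes
// to (1.0, 0.5, 0.0).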
7334 static int stbi__bmp_info(stbi__context *s, int *x, int *y, int *comp)
7337 stbi__bmp_data info;
7340 p = stbi__bmp_parse_header(s, &info);
7345 if (x) *x = s->img_x;
7346 if (y) *y = s->img_y;
7348 if (info.bpp == 24 && info.ma == 0xff000000)
7351 *comp = info.ma ? 4 : 3;
7358 static int stbi__psd_info(stbi__context *s, int *x, int *y, int *comp)
7360 int channelCount, dummy, depth;
7363 if (!comp) comp = &dummy;
7364 if (stbi__get32be(s) != 0x38425053) {
7368 if (stbi__get16be(s) != 1) {
7373 channelCount = stbi__get16be(s);
7374 if (channelCount < 0 || channelCount > 16) {
7378 *y = stbi__get32be(s);
7379 *x = stbi__get32be(s);
7380 depth = stbi__get16be(s);
7381 if (depth != 8 && depth != 16) {
7385 if (stbi__get16be(s) != 3) {
7393 static int stbi__psd_is16(stbi__context *s)
7395 int channelCount, depth;
7396 if (stbi__get32be(s) != 0x38425053) {
7400 if (stbi__get16be(s) != 1) {
7405 channelCount = stbi__get16be(s);
7406 if (channelCount < 0 || channelCount > 16) {
7410 STBI_NOTUSED(stbi__get32be(s));
7411 STBI_NOTUSED(stbi__get32be(s));
7412 depth = stbi__get16be(s);
7422 static int stbi__pic_info(stbi__context *s, int *x, int *y, int *comp)
7424 int act_comp=0,num_packets=0,chained,dummy;
7425 stbi__pic_packet packets[10];
7429 if (!comp) comp = &dummy;
7431 if (!stbi__pic_is4(s,"\x53\x80\xF6\x34")) {
7438 *x = stbi__get16be(s);
7439 *y = stbi__get16be(s);
7440 if (stbi__at_eof(s)) {
7444 if ( (*x) != 0 && (1 << 28) / (*x) < (*y)) {
7452 stbi__pic_packet *packet;
7454 if (num_packets==sizeof(packets)/sizeof(packets[0]))
7457 packet = &packets[num_packets++];
7458 chained = stbi__get8(s);
7459 packet->size = stbi__get8(s);
7460 packet->type = stbi__get8(s);
7461 packet->channel = stbi__get8(s);
7462 act_comp |= packet->channel;
7464 if (stbi__at_eof(s)) {
7468 if (packet->size != 8) {
7474 *comp = (act_comp & 0x10 ? 4 : 3);
7480 // *************************************************************************************************
7481 // Portable Gray Map and Portable Pixel Map loader
7484 // PGM: http://netpbm.sourceforge.net/doc/pgm.html
7485 // PPM: http://netpbm.sourceforge.net/doc/ppm.html
7487 // Known limitations:
7488 // Only supports '#' comments between header tokens (see stbi__pnm_skip_whitespace)
7489 // Does not support ASCII image data (formats P2 and P3)
7493 static int stbi__pnm_test(stbi__context *s)
7496 p = (char) stbi__get8(s);
7497 t = (char) stbi__get8(s);
7498 if (p != 'P' || (t != '5' && t != '6')) {
7505 static void *stbi__pnm_load(stbi__context *s, int *x, int *y, int *comp, int req_comp, stbi__result_info *ri)
7510 ri->bits_per_channel = stbi__pnm_info(s, (int *)&s->img_x, (int *)&s->img_y, (int *)&s->img_n);
7511 if (ri->bits_per_channel == 0)
7514 if (s->img_y > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
7515 if (s->img_x > STBI_MAX_DIMENSIONS) return stbi__errpuc("too large","Very large image (corrupt?)");
7519 if (comp) *comp = s->img_n;
7521 if (!stbi__mad4sizes_valid(s->img_n, s->img_x, s->img_y, ri->bits_per_channel / 8, 0))
7522 return stbi__errpuc("too large", "PNM too large");
7524 out = (stbi_uc *) stbi__malloc_mad4(s->img_n, s->img_x, s->img_y, ri->bits_per_channel / 8, 0);
7525 if (!out) return stbi__errpuc("outofmem", "Out of memory");
7526 if (!stbi__getn(s, out, s->img_n * s->img_x * s->img_y * (ri->bits_per_channel / 8))) {
7528 return stbi__errpuc("bad PNM", "PNM file truncated");
7531 if (req_comp && req_comp != s->img_n) {
7532 if (ri->bits_per_channel == 16) {
7533 out = (stbi_uc *) stbi__convert_format16((stbi__uint16 *) out, s->img_n, req_comp, s->img_x, s->img_y);
7535 out = stbi__convert_format(out, s->img_n, req_comp, s->img_x, s->img_y);
7537 if (out == NULL) return out; // stbi__convert_format frees input on failure
7542 static int stbi__pnm_isspace(char c)
7544 return c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r';
7547 static void stbi__pnm_skip_whitespace(stbi__context *s, char *c)
7550 while (!stbi__at_eof(s) && stbi__pnm_isspace(*c))
7551 *c = (char) stbi__get8(s);
7553 if (stbi__at_eof(s) || *c != '#')
7556 while (!stbi__at_eof(s) && *c != '\n' && *c != '\r' )
7557 *c = (char) stbi__get8(s);
7561 static int stbi__pnm_isdigit(char c)
7563 return c >= '0' && c <= '9';
7566 static int stbi__pnm_getinteger(stbi__context *s, char *c)
7570 while (!stbi__at_eof(s) && stbi__pnm_isdigit(*c)) {
7571 value = value*10 + (*c - '0');
7572 *c = (char) stbi__get8(s);
7573 if((value > 214748364) || (value == 214748364 && *c > '7'))
7574 return stbi__err("integer parse overflow", "Parsing an integer in the PPM header overflowed a 32-bit int");
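// Illustrative sketch, not part of stb_image: overflow-checked decimal
// parsing in the spirit of stbi__pnm_getinteger. parse_decimal is a
// hypothetical name, it reads from a string instead of an stbi__context,
// and it checks each digit before accumulating it rather than peeking at
// the next character, so treat it as a variant of the same idea.

```c
#include <limits.h>

// Hypothetical helper: parses an unsigned decimal integer at *p and advances
// *p past the digits. Returns -1 if the value would not fit in an int.
static int parse_decimal(const char **p)
{
    int value = 0;
    while (**p >= '0' && **p <= '9') {
        int digit = **p - '0';
        // Reject before value*10 + digit could exceed INT_MAX.
        if (value > INT_MAX / 10 ||
            (value == INT_MAX / 10 && digit > INT_MAX % 10))
            return -1;
        value = value * 10 + digit;
        ++*p;
    }
    return value;
}
```

// With a 32-bit int, INT_MAX / 10 is 214748364 and INT_MAX % 10 is 7, the
// same constants that appear in the check above.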
7580 static int stbi__pnm_info(stbi__context *s, int *x, int *y, int *comp)
7587 if (!comp) comp = &dummy;
7592 p = (char) stbi__get8(s);
7593 t = (char) stbi__get8(s);
7594 if (p != 'P' || (t != '5' && t != '6')) {
7599 *comp = (t == '6') ? 3 : 1; // '5' is 1-component .pgm; '6' is 3-component .ppm
7601 c = (char) stbi__get8(s);
7602 stbi__pnm_skip_whitespace(s, &c);
7604 *x = stbi__pnm_getinteger(s, &c); // read width
7606 return stbi__err("invalid width", "PPM image header had zero or overflowing width");
7607 stbi__pnm_skip_whitespace(s, &c);
7609 *y = stbi__pnm_getinteger(s, &c); // read height
7611 return stbi__err("invalid height", "PPM image header had zero or overflowing height");
7612 stbi__pnm_skip_whitespace(s, &c);
7614 maxv = stbi__pnm_getinteger(s, &c); // read max value
7616 return stbi__err("max value > 65535", "PPM image supports only 8-bit and 16-bit images");
7617 else if (maxv > 255)
7623 static int stbi__pnm_is16(stbi__context *s)
7625 if (stbi__pnm_info(s, NULL, NULL, NULL) == 16)
7631 static int stbi__info_main(stbi__context *s, int *x, int *y, int *comp)
7633 #ifndef STBI_NO_JPEG
7634 if (stbi__jpeg_info(s, x, y, comp)) return 1;
7638 if (stbi__png_info(s, x, y, comp)) return 1;
7642 if (stbi__gif_info(s, x, y, comp)) return 1;
7646 if (stbi__bmp_info(s, x, y, comp)) return 1;
7650 if (stbi__psd_info(s, x, y, comp)) return 1;
7654 if (stbi__pic_info(s, x, y, comp)) return 1;
7658 if (stbi__pnm_info(s, x, y, comp)) return 1;
7662 if (stbi__hdr_info(s, x, y, comp)) return 1;
7665 // test tga last because it's a crappy test!
7667 if (stbi__tga_info(s, x, y, comp))
7670 return stbi__err("unknown image type", "Image not of any known type, or corrupt");
7673 static int stbi__is_16_main(stbi__context *s)
7676 if (stbi__png_is16(s)) return 1;
7680 if (stbi__psd_is16(s)) return 1;
7684 if (stbi__pnm_is16(s)) return 1;
7689 #ifndef STBI_NO_STDIO
7690 STBIDEF int stbi_info(char const *filename, int *x, int *y, int *comp)
7692 FILE *f = stbi__fopen(filename, "rb");
7694 if (!f) return stbi__err("can't fopen", "Unable to open file");
7695 result = stbi_info_from_file(f, x, y, comp);
7700 STBIDEF int stbi_info_from_file(FILE *f, int *x, int *y, int *comp)
7704 long pos = ftell(f);
7705 stbi__start_file(&s, f);
7706 r = stbi__info_main(&s,x,y,comp);
7707 fseek(f,pos,SEEK_SET);
7711 STBIDEF int stbi_is_16_bit(char const *filename)
7713 FILE *f = stbi__fopen(filename, "rb");
7715 if (!f) return stbi__err("can't fopen", "Unable to open file");
7716 result = stbi_is_16_bit_from_file(f);
7721 STBIDEF int stbi_is_16_bit_from_file(FILE *f)
7725 long pos = ftell(f);
7726 stbi__start_file(&s, f);
7727 r = stbi__is_16_main(&s);
7728 fseek(f,pos,SEEK_SET);
7731 #endif // !STBI_NO_STDIO
7733 STBIDEF int stbi_info_from_memory(stbi_uc const *buffer, int len, int *x, int *y, int *comp)
7736 stbi__start_mem(&s,buffer,len);
7737 return stbi__info_main(&s,x,y,comp);
7740 STBIDEF int stbi_info_from_callbacks(stbi_io_callbacks const *c, void *user, int *x, int *y, int *comp)
7743 stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
7744 return stbi__info_main(&s,x,y,comp);
7747 STBIDEF int stbi_is_16_bit_from_memory(stbi_uc const *buffer, int len)
7750 stbi__start_mem(&s,buffer,len);
7751 return stbi__is_16_main(&s);
7754 STBIDEF int stbi_is_16_bit_from_callbacks(stbi_io_callbacks const *c, void *user)
7757 stbi__start_callbacks(&s, (stbi_io_callbacks *) c, user);
7758 return stbi__is_16_main(&s);
7761 #endif // STB_IMAGE_IMPLEMENTATION
2.20 (2019-02-07) support utf8 filenames in Windows; fix warnings and platform ifdefs
2.19 (2018-02-11) fix warning
2.18 (2018-01-30) fix warnings
2.17 (2018-01-29) change stbi__shiftsigned to avoid clang -O2 bug
2.16 (2017-07-23) all functions have 16-bit variants;
                  STBI_NO_STDIO works again;
                  fix rounding in unpremultiply;
                  optimize vertical flip;
                  disable raw_len validation;
2.15 (2017-03-18) fix png-1,2,4 bug; now all Imagenet JPGs decode;
                  warning fixes; disable run-time SSE detection on gcc;
                  uniform handling of optional "return" values;
                  thread-safe initialization of zlib tables
2.14 (2017-03-03) remove deprecated STBI_JPEG_OLD; fixes for Imagenet JPGs
2.13 (2016-11-29) add 16-bit API, only supported for PNG right now
2.12 (2016-04-02) fix typo in 2.11 PSD fix that caused crashes
2.11 (2016-04-02) allocate large structures on the stack
                  remove white matting for transparent PSD
                  fix reported channel count for PNG & BMP
                  re-enable SSE2 in non-gcc 64-bit
                  support RGB-formatted JPEG
                  read 16-bit PNGs (only as 8-bit)
2.10 (2016-01-22) avoid warning introduced in 2.09 by STBI_REALLOC_SIZED
2.09 (2016-01-16) allow comments in PNM files
                  16-bit-per-pixel TGA (not bit-per-component)
                  info() for TGA could break due to .hdr handling
                  info() for BMP now shares code instead of sloppy parse
                  can use STBI_REALLOC_SIZED if allocator doesn't support realloc
2.08 (2015-09-13) fix to 2.07 cleanup, reading RGB PSD as RGBA
2.07 (2015-09-13) fix compiler warnings
                  partial animated GIF support
                  limited 16-bpc PSD support
                  #ifdef unused functions
                  bug with < 92 byte PIC,PNM,HDR,TGA
2.06 (2015-04-19) fix bug where PSD returns wrong '*comp' value
2.05 (2015-04-19) fix bug in progressive JPEG handling, fix warning
2.04 (2015-04-15) try to re-enable SIMD on MinGW 64-bit
2.03 (2015-04-12) extra corruption checking (mmozeiko)
                  stbi_set_flip_vertically_on_load (nguillemot)
                  fix NEON support; fix mingw support
2.02 (2015-01-19) fix incorrect assert, fix warning
2.01 (2015-01-17) fix various warnings; suppress SIMD on gcc 32-bit without -msse2
2.00b (2014-12-25) fix STBI_MALLOC in progressive JPEG
2.00 (2014-12-25) optimize JPG, including x86 SSE2 & NEON SIMD (ryg)
                  progressive JPEG (stb)
                  PGM/PPM support (Ken Miller)
                  STBI_MALLOC,STBI_REALLOC,STBI_FREE
                  GIF bugfix -- seemingly never worked
                  STBI_NO_*, STBI_ONLY_*
1.48 (2014-12-14) fix incorrectly-named assert()
1.47 (2014-12-14) 1/2/4-bit PNG support, both direct and paletted (Omar Cornut & stb)
fix bug in interlaced PNG with user-specified channel count (stb)
fix broken tRNS chunk (colorkey-style transparency) in non-paletted PNG
fix MSVC-ARM internal compiler error by wrapping malloc
various warning fixes from Ronny Chevalier
fix MSVC-only compiler problem in code changed in 1.42
don't define _CRT_SECURE_NO_WARNINGS (affects user code)
fixes to stbi__cleanup_jpeg path
added STBI_ASSERT to avoid requiring assert.h
fix search&replace from 1.36 that messed up comments/error messages
fix gcc struct-initialization warning
fix to TGA optimization when req_comp != number of components in TGA;
fix to GIF loading because BMP wasn't rewinding (whoops, no GIFs in my test suite)
add support for BMP version 5 (more ignored fields)
suppress MSVC warnings on integer casts truncating values
fix accidental rename of 'skip' field of I/O
remove duplicate typedef
convert to header file single-file library
if de-iphone isn't set, load iphone images color-swapped instead of returning NULL
fix broken STBI_SIMD path
fix bug where stbi_load_from_file no longer left file pointer in correct place
fix broken non-easy path for 32-bit BMP (possibly never used)
TGA optimization by Arseny Kapoulkine
use STBI_NOTUSED in stbi__resample_row_generic(), fix one more leak in tga failure case
make stbi_is_hdr work in STBI_NO_HDR (as specified), minor compiler-friendly improvements
support for "info" function for all supported filetypes (SpartanJ)
a few more leak fixes, bug in PNG handling (SpartanJ)
added ability to load files via callbacks to accommodate custom input streams (Ben Wenger)
removed deprecated format-specific test/load functions
removed support for installable file formats (stbi_loader) -- would have been broken for IO callbacks anyway
error cases in bmp and tga give messages and don't leak (Raymond Barbiero, grisha)
fix inefficiency in decoding 32-bit BMP (David Woo)
various warning fixes from Aurelien Pocheville
fix bug in GIF palette transparency (SpartanJ)
cast-to-stbi_uc to fix warnings
fix bug in file buffering for PNG reported by SpartanJ
refix trans_data warning (Won Chun)
perf improvements reading from files on platforms with lock-heavy fgetc()
minor perf improvements for jpeg
deprecated type-specific functions so we'll get feedback if they're needed
attempt to fix trans_data warning (Won Chun)
1.23 fixed bug in iPhone support
removed image *writing* support
stbi_info support from Jetro Lauha
GIF support from Jean-Marc Lienher
iPhone PNG-extensions from James Brown
warning-fixes from Nicolas Schulz and Janez Zemva (i.e. Janez Žemva)
1.21 fix use of 'stbi_uc' in header (reported by jon blow)
1.20 added support for Softimage PIC, by Tom Seddon
1.19 bug in interlaced PNG corruption check (found by ryg)
fix a threading bug (local mutable static)
1.17 support interlaced PNG
1.16 major bugfix - stbi__convert_format converted one too many pixels
1.15 initialize some fields for thread safety
1.14 fix threadsafe conversion bug
     header-file-only version (#define STBI_HEADER_FILE_ONLY before including)
1.12 const qualifiers in the API
1.11 Support installable IDCT, colorspace conversion routines
1.10 Fixes for 64-bit (don't use "unsigned long")
     optimized upsampling by Fabian "ryg" Giesen
1.09 Fix format-conversion for PSD code (bad global variables!)
1.08 Thatcher Ulrich's PSD code integrated by Nicolas Schulz
1.07 attempt to fix C++ warning/errors again
1.06 attempt to fix C++ warning/errors again
1.05 fix TGA loading to return correct *comp and use good luminance calc
1.04 default float alpha is 1, not 255; use 'void *' for stbi_image_free
1.03 bugfixes to STBI_NO_STDIO, STBI_NO_HDR
1.02 support for (subset of) HDR files, float interface for preferred access to them
1.01 fix bug: possible bug in handling right-side up bmps... not sure
     fix bug: the stbi__bmp_load() and stbi__tga_load() functions didn't work at all
1.00 interface to zlib that skips zlib header
0.99 correct handling of alpha in palette
0.98 TGA loader by lonesock; dynamically add loaders (untested)
0.97 jpeg errors on too large a file; also catch another malloc failure
0.96 fix detection of invalid v value - particleman@mollyrocket forum
0.95 during header scan, seek to markers in case of padding
0.94 STBI_NO_STDIO to disable stdio usage; rename all #defines the same
0.93 handle jpegtran output; verbose errors
0.92 read 4,8,16,24,32-bit BMP files of several formats
0.91 output 24-bit Windows 3.0 BMP files
0.90 fix a few more warnings; bump version number to approach 1.0
0.61 bugfixes due to Marc LeBlanc, Christopher Lloyd
0.60 fix compiling as c++
0.59 fix warnings: merge Dave Moore's -Wall fixes
0.58 fix bug: zlib uncompressed mode len/nlen was wrong endian
0.57 fix bug: jpg last huffman symbol before marker was >9 bits but less than 16 available
0.56 fix bug: zlib uncompressed mode len vs. nlen
0.55 fix bug: restart_interval not initialized to 0
0.54 allow NULL for 'int *comp'
0.53 fix bug in png 3->4; speedup png decoding
0.52 png handles req_comp=3,4 directly; minor cleanup; jpeg comments
0.51 obey req_comp requests, 1-component jpegs return as 1-component,
     on 'test' only check type, not whether we support this variant
first released version
------------------------------------------------------------------------------
This software is available under 2 licenses -- choose whichever you prefer.
------------------------------------------------------------------------------
ALTERNATIVE A - MIT License
Copyright (c) 2017 Sean Barrett
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do
so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
------------------------------------------------------------------------------
ALTERNATIVE B - Public Domain (www.unlicense.org)
This is free and unencumbered software released into the public domain.
Anyone is free to copy, modify, publish, use, compile, sell, or distribute this
software, either in source code form or as a compiled binary, for any purpose,
commercial or non-commercial, and by any means.
In jurisdictions that recognize copyright laws, the author or authors of this
software dedicate any and all copyright interest in the software to the public
domain. We make this dedication for the benefit of the public at large and to
the detriment of our heirs and successors. We intend this dedication to be an
overt act of relinquishment in perpetuity of all present and future rights to
this software under copyright law.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
------------------------------------------------------------------------------