Deserialized bytecode does not have IC state, i.e., `bc->ic == NULL`.
That may or may not be bug (IMO, it is and we should rebuild the
IC state during deserialization) but, either way, don't segfault.
DRY add_ic_slot() and its call sites in a hopefully NFC manner.
Don't raise a "invalid tag 12" exception when encountering bytecode
and JS_READ_OBJ_BYTECODE is not set, because no one knows what "tag 12"
means without looking it up, not even quickjs maintainers.
Don't store the update flag in the IC because that's a) an out-of-band
signalling mechanism, and b) makes JSInlineCache bigger than it needs
to be. One is allocated per function so it adds up.
Another reason for making this change is that it makes visible what
I strongly suspect are bugs in the original implementation.
Otherwise it's too easy to tie up too many resources (cpu, memory) by
crafting inputs with a very large atom count (up to 4 billion.)
This may need some finetuning. If the limit proves too restrictive for
very large snapshots, we can make it relative to the size of the input.
Check inside the deserializer that const atoms are indeed const, don't
trust the input. The serializer only writes type 0 records for const
atoms but the byte stream may have been corrupted or manipulated.
Overlooked during review of c25aad7 ("Add ability to (de)serialize
symbols")
Found with libfuzzer and it found it _really_ fast. Great tool.
Before this commit it segfaulted, now it throws a SyntaxError.
That's still not correct behavior but better than segfaulting.
To be continued.
Includes a small run-test262 fix to handle Windows line endings.
Refs: https://github.com/quickjs-ng/quickjs/issues/567
`JS_NewClassID(rt, &class_id)` where `class_id` is a global variable
is unsafe when called from multiple threads but that is exactly what
quickjs-libc.c did.
Add a new JS_AddRuntimeFinalizer function that lets quickjs-libc
store the class ids in JSRuntimeState and defer freeing the memory
until the runtime is destroyed. Necessary because object finalizers
such as js_std_file_finalizer need to know the class id and run after
js_std_free_handlers runs.
Fixes: https://github.com/quickjs-ng/quickjs/issues/577
Make sure the one set in the malloc functions is used rather than the
default one, since it will likely use a different allocator.
For some reason, this didn't cause a problem on macOS, but it does in
Linux. Opsie! Added some CI to prevent these kinds of bugs.
After 56da486312 it's possible existing
code relied on the current exception not being null to dump it, and the
dumped value just said "[unsupported type]". This change provides a more
descriptive value.
Rather than having the user take care of JSMallocState, take care of the
bookkeeping internally (and make JSMallocState non-public since it's no
longer necessary) and keep the allocation functions to the bare minimum.
This has the advantage that using a different allocator is just a few
lines of code, and there is no need to copy the default implementation
just to moficy the call to the allocation function.
Fixes: https://github.com/quickjs-ng/quickjs/issues/285
The spec says HostMakeJobCallback has to be used on the callback: https://tc39.es/ecma262/multipage/managing-memory.html#sec-finalization-registry-cleanup-callback
That makes the following (arguably contrived) example run forever until
memory is exhausted.
```js
let count = 0;
function main() {
console.log(`main! ${++count}`);
const registry = new FinalizationRegistry(() => {
globalThis.foo = main();
});
registry.register([]);
registry.register([]);
return registry;
}
main();
console.log(count);
```
That is unlike V8, which runs 0 times. This can be explained by the
difference in GC implementations and since FinRec makes GC observable,
here we are!
Fixes: https://github.com/quickjs-ng/quickjs/issues/432
Analogously to JS_WriteObject2, it allows the user to get a tab with all
the SAB objects that were read.
This can help adjust reference counts in a scenario where a SAB that was
written increased it and it's necessary to decrease it upon reading it.
- no longer pass the array length to `utf8_decode`
- add `utf8_decode_len` for border cases
- use switch based dispatch in `utf8_decode_len` to work around a gcc 12.2 optimizer bug
integer conversions:
- improve `u32toa_radix` and `u64toa_radix`, add `i32toa_radix`
- use `i32toa_radix` for small ints in `js_number_toString`
floating point conversions (`js_dtoa`):
- complete rewrite with fewer calls to `snprintf`
- remove `JS_DTOA_FORMAT`, define 4 possible modes for `js_dtoa`
- remove the radix argument in `js_dtoa`
- merge `js_dtoa1` into `js_dtoa`
- add `js_dtoa_infinite` for non finite values
- simplify sign handling
- handle locale specific decimal point transparently
helper function `js_fcvt`:
- simplify `js_fcvt`, remove `js_fcvt1`, reduce overhead
- round up manually instead of using `fesetround(FE_UPWARD)`.
helper function `js_ecvt`:
- document `js_ecvt` and `js_ecvt1` behavior
- avoid redundant `js_ecvt1` calls in `js_ecvt`
- fixed buffer contents, no buffer copies
- simplify decimal point handling
- round up manually instead of using `fesetround(FE_UPWARD)`.
miscellaneous:
- remove `CONFIG_PRINTF_RNDN`. This fixes some of the conversion errors
on Windows. Updated the tests accordingly
- this fixes a v8.sh bug on macOS: `0.5.toFixed(0)` used to produce `0` instead of `1`
- add regression tests, update test_conv unit tests
- add benchmarks for `toFixed`, `toPrecision` and `toExponential` number methods
- benchmarks show all conversions are now 40 to 45% faster (M2)
- use single test in `js_strtod` loop.
- use more explicit `ATOD_xxx` flags
- remove `ATOD_TYPE_MASK`, use `ATOD_WANT_BIG_INT` instead
- remove unused arguments `flags` and `pexponent` in `js_string_to_bigint`
- merge `js_atof` and `js_atof2`, remove `slimb_t *pexponent` argument
- simplify and document `js_atof` parser, remove cumbersome labels,
- simplify `js_parseInt` test for zero radix for `ATOD_ACCEPT_HEX_PREFIX`
- simplify `next_token` number parsing, handle legacy octal in parser only
- simplify `JS_StringToBigInt`, use flags only.
- remove unused `slimb_t exponent` token field
- add number syntax tests
Ensure proper UTF-8 encoding (1 to 4 bytes).
Handle invalid encodings (return 0xFFFD and consume a single byte)
Individually encoded surrogate code points are accepted.
- add `utf8_scan()` to analyze a byte array for UTF-8 contents
detects invalid encoding, computes number of codepoints and content kind:
plain ASCII, 8-bit, 16-bit or larger codepoints.
- add `utf8_encode_len(c)` to compute the number of bytes to encode `c`
- rename `unicode_to_utf8` as `utf8_encode`
- rename `unicode_from_utf8` as `utf8_decode`
- add `utf8_decode_buf8(dest, size, src, len)` to decode a UTF-8 encoded
byte array known to contain only ASCII and 8-bit codepoints.
- add `utf8_decode_buf16(dest, size, src, len)` to decode a UTF-8 encoded
byte array into an array of 16-bit codepoints using UTF-16 surrogate pairs
for non-BMP1 codepoints.
- add `utf8_encode_buf8(dest, size, src, len)` to encode an array of 8-bit
codepoints as a UTF-8 encoded null terminated string
- add `utf16_encode_buf8(dest, size, src, len)` to decode an array of 16-bit
codepoints (including surrogate pairs) as a UTF-8 encoded null terminated string
- detect invalid UTF-8 encoding in RegExp parser
- simplify `JS_AtomGetStrRT`, `JS_NewStringLen` using the above functions
- simplify UTF-8 decoding and error testing
- output more informative error messages in `js_parse_expect`.
The previous code was bogus:
```
return js_parse_error(s, "expecting '%c'", tok);
```
this was causing a bug on `eval("do;")` where `tok` is `TOK_WHILE` (-70, 0xBA)
creating an invalid UTF-8 encoding (lone trailing byte).
This would ultimately have caused a failure in `JS_ThrowError2` if `JS_NewString`
failed when converting the error message to a string if the conversion detected the invalid
UTF-8 encoding and throwed an error (it currently does not, but should).
- test for `JS_NewString` failure in `JS_ThrowError2`
- test for `JS_FreeCString` failure in run-test262.c
- add more test cases
String values are allocated as temporary or final results. This commit
attempts to improve the consistency and performance of this step.
- define `JS_NewString` as an inline function to allow simple expansion
of `strlen()` for string literals
- document string contents constraints regarding UTF-8 encoding.
- rename `js_new_string8` as `js_new_string8_len`. takes `const char *`.
- new inline function `js_new_string8` takes `const char *`, computes
string length with `strlen` and calls `js_new_string8_len`. No overhead
for string literals
- rename `js_new_string16` to `js_new_string16_len`
- use internal string allocation functions where appropriate, remove overhead
- allocate extra byte for null terminator in source code string
- fix radix conversion rounding code: incrementing the digit
does not work for '9'. We can assume ASCII so it works for
all other digits, especially all letters
- also avoid recomputing the string length
* Add utility functions, improve integer conversion functions
- move `is_be()` to cutils.h
- add `is_upper_ascii()` and `to_upper_ascii()`
- add extensive benchmark for integer conversion variants in **tests/test_conv.c**
- add `u32toa()`, `i32toa()`, `u64toa()`, `i64toa()` based on register shift variant
- add `u32toa_radix()`, `u64toa_radix()`, `i64toa_radix()` based on length_loop variant
- use direct converters instead of `snprintf()`
- copy NaN and Infinity directly in `js_dtoa1()`
- optimize `js_number_toString()` for small integers
- use `JS_NewStringLen()` instead of `JS_NewString()` when possible
- add more precise conversion tests in microbench.js
- disable some benchmark tests for gcc (they cause ASAN failures)
- `-s` strips the source code
- `-ss` strips source and line/column numbers information
- `qjsc repl.js` generates an object size of **105726** bytes
- `qjsc -s repl.js` generates an object size of **20853** bytes
- `qjsc -ss repl.js` generates an object size of only **16147** bytes
- compile repl.js with `-ss`
- bump byte code version to 12