Check inside the deserializer that const atoms are indeed const, don't
trust the input. The serializer only writes type 0 records for const
atoms but the byte stream may have been corrupted or manipulated.
Overlooked during review of c25aad7 ("Add ability to (de)serialize
symbols")
Found with libfuzzer and it found it _really_ fast. Great tool.
Before this commit it segfaulted, now it throws a SyntaxError.
That's still not correct behavior but better than segfaulting.
To be continued.
Includes a small run-test262 fix to handle Windows line endings.
Refs: https://github.com/quickjs-ng/quickjs/issues/567
`JS_NewClassID(rt, &class_id)` where `class_id` is a global variable
is unsafe when called from multiple threads but that is exactly what
quickjs-libc.c did.
Add a new JS_AddRuntimeFinalizer function that lets quickjs-libc
store the class ids in JSRuntimeState and defer freeing the memory
until the runtime is destroyed. Necessary because object finalizers
such as js_std_file_finalizer need to know the class id and run after
js_std_free_handlers runs.
Fixes: https://github.com/quickjs-ng/quickjs/issues/577
Make sure the one set in the malloc functions is used rather than the
default one, since it will likely use a different allocator.
For some reason, this didn't cause a problem on macOS, but it does in
Linux. Opsie! Added some CI to prevent these kinds of bugs.
After 56da486312 it's possible existing
code relied on the current exception not being null to dump it, and the
dumped value just said "[unsupported type]". This change provides a more
descriptive value.
Rather than having the user take care of JSMallocState, take care of the
bookkeeping internally (and make JSMallocState non-public since it's no
longer necessary) and keep the allocation functions to the bare minimum.
This has the advantage that using a different allocator is just a few
lines of code, and there is no need to copy the default implementation
just to moficy the call to the allocation function.
Fixes: https://github.com/quickjs-ng/quickjs/issues/285
The spec says HostMakeJobCallback has to be used on the callback: https://tc39.es/ecma262/multipage/managing-memory.html#sec-finalization-registry-cleanup-callback
That makes the following (arguably contrived) example run forever until
memory is exhausted.
```js
let count = 0;
function main() {
console.log(`main! ${++count}`);
const registry = new FinalizationRegistry(() => {
globalThis.foo = main();
});
registry.register([]);
registry.register([]);
return registry;
}
main();
console.log(count);
```
That is unlike V8, which runs 0 times. This can be explained by the
difference in GC implementations and since FinRec makes GC observable,
here we are!
Fixes: https://github.com/quickjs-ng/quickjs/issues/432
Analogously to JS_WriteObject2, it allows the user to get a tab with all
the SAB objects that were read.
This can help adjust reference counts in a scenario where a SAB that was
written increased it and it's necessary to decrease it upon reading it.
- no longer pass the array length to `utf8_decode`
- add `utf8_decode_len` for border cases
- use switch based dispatch in `utf8_decode_len` to work around a gcc 12.2 optimizer bug
integer conversions:
- improve `u32toa_radix` and `u64toa_radix`, add `i32toa_radix`
- use `i32toa_radix` for small ints in `js_number_toString`
floating point conversions (`js_dtoa`):
- complete rewrite with fewer calls to `snprintf`
- remove `JS_DTOA_FORMAT`, define 4 possible modes for `js_dtoa`
- remove the radix argument in `js_dtoa`
- merge `js_dtoa1` into `js_dtoa`
- add `js_dtoa_infinite` for non finite values
- simplify sign handling
- handle locale specific decimal point transparently
helper function `js_fcvt`:
- simplify `js_fcvt`, remove `js_fcvt1`, reduce overhead
- round up manually instead of using `fesetround(FE_UPWARD)`.
helper function `js_ecvt`:
- document `js_ecvt` and `js_ecvt1` behavior
- avoid redundant `js_ecvt1` calls in `js_ecvt`
- fixed buffer contents, no buffer copies
- simplify decimal point handling
- round up manually instead of using `fesetround(FE_UPWARD)`.
miscellaneous:
- remove `CONFIG_PRINTF_RNDN`. This fixes some of the conversion errors
on Windows. Updated the tests accordingly
- this fixes a v8.sh bug on macOS: `0.5.toFixed(0)` used to produce `0` instead of `1`
- add regression tests, update test_conv unit tests
- add benchmarks for `toFixed`, `toPrecision` and `toExponential` number methods
- benchmarks show all conversions are now 40 to 45% faster (M2)
- use single test in `js_strtod` loop.
- use more explicit `ATOD_xxx` flags
- remove `ATOD_TYPE_MASK`, use `ATOD_WANT_BIG_INT` instead
- remove unused arguments `flags` and `pexponent` in `js_string_to_bigint`
- merge `js_atof` and `js_atof2`, remove `slimb_t *pexponent` argument
- simplify and document `js_atof` parser, remove cumbersome labels,
- simplify `js_parseInt` test for zero radix for `ATOD_ACCEPT_HEX_PREFIX`
- simplify `next_token` number parsing, handle legacy octal in parser only
- simplify `JS_StringToBigInt`, use flags only.
- remove unused `slimb_t exponent` token field
- add number syntax tests
Ensure proper UTF-8 encoding (1 to 4 bytes).
Handle invalid encodings (return 0xFFFD and consume a single byte)
Individually encoded surrogate code points are accepted.
- add `utf8_scan()` to analyze a byte array for UTF-8 contents
detects invalid encoding, computes number of codepoints and content kind:
plain ASCII, 8-bit, 16-bit or larger codepoints.
- add `utf8_encode_len(c)` to compute the number of bytes to encode `c`
- rename `unicode_to_utf8` as `utf8_encode`
- rename `unicode_from_utf8` as `utf8_decode`
- add `utf8_decode_buf8(dest, size, src, len)` to decode a UTF-8 encoded
byte array known to contain only ASCII and 8-bit codepoints.
- add `utf8_decode_buf16(dest, size, src, len)` to decode a UTF-8 encoded
byte array into an array of 16-bit codepoints using UTF-16 surrogate pairs
for non-BMP1 codepoints.
- add `utf8_encode_buf8(dest, size, src, len)` to encode an array of 8-bit
codepoints as a UTF-8 encoded null terminated string
- add `utf16_encode_buf8(dest, size, src, len)` to decode an array of 16-bit
codepoints (including surrogate pairs) as a UTF-8 encoded null terminated string
- detect invalid UTF-8 encoding in RegExp parser
- simplify `JS_AtomGetStrRT`, `JS_NewStringLen` using the above functions
- simplify UTF-8 decoding and error testing
- output more informative error messages in `js_parse_expect`.
The previous code was bogus:
```
return js_parse_error(s, "expecting '%c'", tok);
```
this was causing a bug on `eval("do;")` where `tok` is `TOK_WHILE` (-70, 0xBA)
creating an invalid UTF-8 encoding (lone trailing byte).
This would ultimately have caused a failure in `JS_ThrowError2` if `JS_NewString`
failed when converting the error message to a string if the conversion detected the invalid
UTF-8 encoding and throwed an error (it currently does not, but should).
- test for `JS_NewString` failure in `JS_ThrowError2`
- test for `JS_FreeCString` failure in run-test262.c
- add more test cases
String values are allocated as temporary or final results. This commit
attempts to improve the consistency and performance of this step.
- define `JS_NewString` as an inline function to allow simple expansion
of `strlen()` for string literals
- document string contents constraints regarding UTF-8 encoding.
- rename `js_new_string8` as `js_new_string8_len`. takes `const char *`.
- new inline function `js_new_string8` takes `const char *`, computes
string length with `strlen` and calls `js_new_string8_len`. No overhead
for string literals
- rename `js_new_string16` to `js_new_string16_len`
- use internal string allocation functions where appropriate, remove overhead
- allocate extra byte for null terminator in source code string
- fix radix conversion rounding code: incrementing the digit
does not work for '9'. We can assume ASCII so it works for
all other digits, especially all letters
- also avoid recomputing the string length
* Add utility functions, improve integer conversion functions
- move `is_be()` to cutils.h
- add `is_upper_ascii()` and `to_upper_ascii()`
- add extensive benchmark for integer conversion variants in **tests/test_conv.c**
- add `u32toa()`, `i32toa()`, `u64toa()`, `i64toa()` based on register shift variant
- add `u32toa_radix()`, `u64toa_radix()`, `i64toa_radix()` based on length_loop variant
- use direct converters instead of `snprintf()`
- copy NaN and Infinity directly in `js_dtoa1()`
- optimize `js_number_toString()` for small integers
- use `JS_NewStringLen()` instead of `JS_NewString()` when possible
- add more precise conversion tests in microbench.js
- disable some benchmark tests for gcc (they cause ASAN failures)
- `-s` strips the source code
- `-ss` strips source and line/column numbers information
- `qjsc repl.js` generates an object size of **105726** bytes
- `qjsc -s repl.js` generates an object size of **20853** bytes
- `qjsc -ss repl.js` generates an object size of only **16147** bytes
- compile repl.js with `-ss`
- bump byte code version to 12
* Fix member accesses for non-decimal numeric literals
e.g. 0x0.a should return undefined, not SyntaxError.
* Remove ineffective non-decimal float parsing code and redundant checks on `is_float && radix != 10`
(The code already wasn't doing anything because of the `is_float` check.)
- improve `JS_DumpString`: use `L` prefix for wide strings
- dump variable kind and flags for locals and closures
- disassemble byte code in DUMP_READ_OBJECT
- pass start_pos to `dump_byte_code` and `dump_single_byte_code`
- write constant pool before function bytecode (bump version to 11)
- update generated code
* Expose public equality comparison and sameness public API.
- add `JS_IsEqual` (operator `==`), returns an `int`: `-1` if an exception was thrown
- add `JS_IsStrictEqual` (operator `===`) always succeeds, returns a `JS_BOOL`
- add `JS_IsSameValue` always succeeds, returns a `JS_BOOL`
- add `JS_IsSameValueZero` always succeeds, returns a `JS_BOOL`
- DUMP_XXX defined as nothing or 0 produces unconditional output
- DUMP_XXX defined as a bitmask produces conditional output based
on command line option -d<bitmask>
- add `JS_SetDumpFlags()` to select active dump options
- accept -d[<hex mask>] and --dump[=<hex mask>] to specify active
dump options, generalize command line option handling
- improve DUMP_READ_OBJECT output, fix indentation issue
In the pathological case shown in
https://github.com/quickjs-ng/quickjs/issues/367 both the object and the
registry will be destroyed as part of the GC phase of JS_FreeRuntime.
When the GC sweep happens it's possible we are holding on to a corpse so
avoid calling the registry callback in that case.
This is similar to how Weak{Map,Set} deal with iterators being freed as
part of a cycle.
Fixes: https://github.com/quickjs-ng/quickjs/issues/367
- avoid crashing on invalid atoms in `JS_AtomGetStrRT`
- do not dump objects and function_bytecode during
`JS_GC_PHASE_REMOVE_CYCLES` phase
- fix crash in `print_lines` on null source