Ensure proper UTF-8 encoding (1 to 4 bytes).
Handle invalid encodings (return 0xFFFD and consume a single byte)
Individually encoded surrogate code points are accepted.
- add `utf8_scan()` to analyze a byte array for UTF-8 contents
detects invalid encoding, computes number of codepoints and content kind:
plain ASCII, 8-bit, 16-bit or larger codepoints.
- add `utf8_encode_len(c)` to compute the number of bytes to encode `c`
- rename `unicode_to_utf8` as `utf8_encode`
- rename `unicode_from_utf8` as `utf8_decode`
- add `utf8_decode_buf8(dest, size, src, len)` to decode a UTF-8 encoded
byte array known to contain only ASCII and 8-bit codepoints.
- add `utf8_decode_buf16(dest, size, src, len)` to decode a UTF-8 encoded
byte array into an array of 16-bit codepoints using UTF-16 surrogate pairs
for non-BMP1 codepoints.
- add `utf8_encode_buf8(dest, size, src, len)` to encode an array of 8-bit
codepoints as a UTF-8 encoded null terminated string
- add `utf16_encode_buf8(dest, size, src, len)` to decode an array of 16-bit
codepoints (including surrogate pairs) as a UTF-8 encoded null terminated string
- detect invalid UTF-8 encoding in RegExp parser
- simplify `JS_AtomGetStrRT`, `JS_NewStringLen` using the above functions
- simplify UTF-8 decoding and error testing
- output more informative error messages in `js_parse_expect`.
The previous code was bogus:
```
return js_parse_error(s, "expecting '%c'", tok);
```
this was causing a bug on `eval("do;")` where `tok` is `TOK_WHILE` (-70, 0xBA)
creating an invalid UTF-8 encoding (lone trailing byte).
This would ultimately have caused a failure in `JS_ThrowError2` if `JS_NewString`
failed when converting the error message to a string if the conversion detected the invalid
UTF-8 encoding and throwed an error (it currently does not, but should).
- test for `JS_NewString` failure in `JS_ThrowError2`
- test for `JS_FreeCString` failure in run-test262.c
- add more test cases
String values are allocated as temporary or final results. This commit
attempts to improve the consistency and performance of this step.
- define `JS_NewString` as an inline function to allow simple expansion
of `strlen()` for string literals
- document string contents constraints regarding UTF-8 encoding.
- rename `js_new_string8` as `js_new_string8_len`. takes `const char *`.
- new inline function `js_new_string8` takes `const char *`, computes
string length with `strlen` and calls `js_new_string8_len`. No overhead
for string literals
- rename `js_new_string16` to `js_new_string16_len`
- use internal string allocation functions where appropriate, remove overhead
- allocate extra byte for null terminator in source code string
- fix radix conversion rounding code: incrementing the digit
does not work for '9'. We can assume ASCII so it works for
all other digits, especially all letters
- also avoid recomputing the string length
* Add utility functions, improve integer conversion functions
- move `is_be()` to cutils.h
- add `is_upper_ascii()` and `to_upper_ascii()`
- add extensive benchmark for integer conversion variants in **tests/test_conv.c**
- add `u32toa()`, `i32toa()`, `u64toa()`, `i64toa()` based on register shift variant
- add `u32toa_radix()`, `u64toa_radix()`, `i64toa_radix()` based on length_loop variant
- use direct converters instead of `snprintf()`
- copy NaN and Infinity directly in `js_dtoa1()`
- optimize `js_number_toString()` for small integers
- use `JS_NewStringLen()` instead of `JS_NewString()` when possible
- add more precise conversion tests in microbench.js
- disable some benchmark tests for gcc (they cause ASAN failures)
- `-s` strips the source code
- `-ss` strips source and line/column numbers information
- `qjsc repl.js` generates an object size of **105726** bytes
- `qjsc -s repl.js` generates an object size of **20853** bytes
- `qjsc -ss repl.js` generates an object size of only **16147** bytes
- compile repl.js with `-ss`
- bump byte code version to 12
* Fix member accesses for non-decimal numeric literals
e.g. 0x0.a should return undefined, not SyntaxError.
* Remove ineffective non-decimal float parsing code and redundant checks on `is_float && radix != 10`
(The code already wasn't doing anything because of the `is_float` check.)
- improve `JS_DumpString`: use `L` prefix for wide strings
- dump variable kind and flags for locals and closures
- disassemble byte code in DUMP_READ_OBJECT
- pass start_pos to `dump_byte_code` and `dump_single_byte_code`
- write constant pool before function bytecode (bump version to 11)
- update generated code
* Expose public equality comparison and sameness public API.
- add `JS_IsEqual` (operator `==`), returns an `int`: `-1` if an exception was thrown
- add `JS_IsStrictEqual` (operator `===`) always succeeds, returns a `JS_BOOL`
- add `JS_IsSameValue` always succeeds, returns a `JS_BOOL`
- add `JS_IsSameValueZero` always succeeds, returns a `JS_BOOL`
- DUMP_XXX defined as nothing or 0 produces unconditional output
- DUMP_XXX defined as a bitmask produces conditional output based
on command line option -d<bitmask>
- add `JS_SetDumpFlags()` to select active dump options
- accept -d[<hex mask>] and --dump[=<hex mask>] to specify active
dump options, generalize command line option handling
- improve DUMP_READ_OBJECT output, fix indentation issue
In the pathological case shown in
https://github.com/quickjs-ng/quickjs/issues/367 both the object and the
registry will be destroyed as part of the GC phase of JS_FreeRuntime.
When the GC sweep happens it's possible we are holding on to a corpse so
avoid calling the registry callback in that case.
This is similar to how Weak{Map,Set} deal with iterators being freed as
part of a cycle.
Fixes: https://github.com/quickjs-ng/quickjs/issues/367
- avoid crashing on invalid atoms in `JS_AtomGetStrRT`
- do not dump objects and function_bytecode during
`JS_GC_PHASE_REMOVE_CYCLES` phase
- fix crash in `print_lines` on null source
* Optimize `JS_GetPropertyInt64` and `JS_TryGetPropertyInt64`
- add `js_get_fast_array_element()` to special case arrays and typed arrays
- use `js_get_fast_array_element()` in `JS_GetPropertyValue()`,
`JS_TryGetPropertyInt64()` and `JS_GetPropertyInt64()`.
- simplify `js_array_at()`
- change error message for `Object.create` invalid property descriptor
- disable v8 test cases for deprecated legacy RegExp static properties
and invalid left hand side error type
- update v8.txt
- fix v8.sh behavior for single tests
Translate IC opcodes to their non-IC variants before writing them out.
Before this commit they were not byte-swapped properly, breaking the
ability to load serialized bytecode containing ICs on systems with
different endianness. Inline caches are recomputed as needed now.
A pleasing side effect of this change is that serialized bytecode is,
on average, a little smaller because fewer atoms are duplicated now.
* Fix more error cases
- fix more cases of missing `sf->cur_pc`.
- use more precise error messages for number conversion methods
- add test cases in test_builtin.js
- updated v8 test results
* Improve consistency of JS_NewFloat64 API
- `JS_NewFloat64()` always creates a `JS_TAG_FLOAT64` value
- internal `js_float64()` always creates a `JS_TAG_FLOAT64` value
- add `js_int64` internal function for consistency
- rename `float_is_int32` as `double_is_int32`
- handle `INT32_MIN` in `double_is_int32`, use (somewhat) faster alternative
- add `js_number(d)` to create a `JS_TAG_FLOAT64` or a `JS_TAG_INT` value
if possible
- add `JS_NewNumber()` API for the same purpose
- use non testing constructor for infinities in `js_atof2`
- always store internal time value as a float64
- merge `JS_NewBigInt64_1` into `JS_NewBigInt64`
- use comparisons instead of `(int32_t)` casts (implementation defined behavior)
* Improve string parsing and JSON parsing
- fix JSON parsing of non ASCII string contents
- more precise string parsing errors
- more precise JSON parsing errors
- add `JS_ParseState::buf_start` to compute line/column
- fix HTML comment detection at start of source code
- improve v8 Failure messages (pulled and modified `formatFailureText` from **mjsunit.js**)
- ignore more v8 tests
* make `Object.prototype` an immutable prototype object
* throw an exception on `Object.setPrototypeOf(Object.prototype, xxx)`
* do not throw an exception for `Reflect.setPrototypeOf(Object.prototype, xxx)`
dlmalloc has been removed and the NDK now exposes a malloc.h header with
malloc_usable_size exposed, so use that.
Also remove the duplication in js__malloc_usable_size.
Fixes: https://github.com/quickjs-ng/quickjs/issues/304
* Improve error handling
- throw RangeError for invalid string length
- throw RangeError for stack overflow with updated message
- fix case for `BigInt` error messages
- refine stack check for `next_token` and `json_next_token`
- throw SyntaxError for too many variables, arguments, parameters...
- v8.js: disable v8 specific tests
- v8.js: disable Realm object tests
- v8.js: disable MODULE tests
- v8.js: disable RegExp static properties tests
- use more precise error messages
- reorder property lookup in `js_obj_to_desc()` according to ECMA
- set global object's [Symbol.toStringTag] to "global"
- fix error message for duplicate parameter name in strict mode