-
Notifications
You must be signed in to change notification settings - Fork 171
uint8_t parsing #349
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
uint8_t parsing #349
Conversation
|
first implementation.
I have added a separate benchmark which I will merge to the main benchmark later. Apple M1 (8) @ 3.20 GHz using clang (arm64) Apple M1 (8) @ 3.20 GHz using gcc (arm64) Intel(R) Core(TM) i7-5500U (4) @ 3.00 GHz using clang (x86_64) Intel(R) Core(TM) i7-5500U (4) @ 3.00 GHz using gcc (x86_64) |
|
Request for review @lemire |
|
A constexpr branch should be efficient and enough for handling |
|
I recommend adopting your benchmark immediately: #350 It does not use your SWAR approach, we just benchmark fast_float vs the standard. For the memcpy, it is not usable in a constexpr context, but bit_cast is. |
Alright. I will rebase this branch after the PR gets merged.
Ah, we only choose the branch at compile time. I should rather do: About the memcpy UB, I cannot find a fast (and generally-applicable) way to fix this. Adding a branch on Do we document a precondition: atleast 4 readable bytes from |
I think that this should be if constexpr (is_uint8) {although some care is needed not to break backward compatibility with earllier versions of C++.
No. We don't do that. We can get creative, but we don't want to read beyond the buffer in general as this might cause a fatal crash in some instances. |
|
There is an issue with the benchmarking library. It is adding too much overhead(due to templating I think). This is inaccurate compared to the actual raw speed. I wrote a simple benchmark which just measures the throughput: |
|
@shikharish Please see #351 |
Signed-off-by: Shikhar <[email protected]>
Signed-off-by: Shikhar <[email protected]>
Signed-off-by: Shikhar <[email protected]>
|
rebased. @lemire |
|
@shikharish Impressive... main: this PR I am not sure how we are beating Your code looks very good to me, but I'd want more tests before merging this. Thus, please see shikharish#1 (The test do pass, but I'd like you to review the test and make sure you feel that we have enough tests.) |
adding some ipv4 test
Signed-off-by: Shikhar <[email protected]>
|
Merged. Will be in next release. I will blog about it too (with credit to you). |
|
@lemire Thanks a lot! This was a great learning experience for me. I want to work on some more issues/projects. Is there anything you have in mind I can work on? I have been looking at other projects(simdutf,simdjson,etc.) so anything in that general space would be exciting. Honestly, I am only starting to get into projects focusing on performance and latency(I've realized I love fast code), so anything would be a learning experience. Either way, thanks again. Looking forward to the next release(and the blog post!). |
|
@shikharish I have no secrets. Because this seems to greatly speed up the common case when parsing IPv4 addresses, it seems that taking this code into ada-url/ada would be a low hanging fruit. IPv4 can come in different forms but we might be able to do optimistic parsing followed by a fallback. The ada library is used by millions of people. Now we don’t want to bring fast_float into this… but the code is simple enough. |
Related to #226.