Handling null vs empty strings #114
I'm not sure this is a good idea. Semantically, an empty field and an empty quoted field are exactly the same in the underlying CSV data. For example, in order to add this feature, you'd need to thread changes all the way down into It's not clear why you want this either. Can you provide real world examples?
In the example above I get a The use case is simply that I'm running SQL queries against CSV files using predicates like I admit I haven't read a CSV spec in decades ... so maybe this is just how CSV works?
I just read this blog post about how Spark handles this and it states that empty strings and missing strings are equivalent in CSV, so I guess we can close this issue. I suppose I can add an option in my project to treat empty string as null or not. Thanks for the help.
What you can do is define a custom deserialize_with function for serde that converts empty strings to None values, so that you don't need to spread the case analysis out.
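The conversion at the heart of such a function is small. Below is a minimal, std-only sketch of just that step; the name empty_to_none is hypothetical and not part of the csv or serde crates, and the serde wiring itself is only described in a comment rather than shown.

```rust
/// Hypothetical helper (not part of csv or serde): map an empty field to
/// `None` and any other value to `Some`, so callers handle the "missing"
/// case in exactly one place.
fn empty_to_none(field: String) -> Option<String> {
    if field.is_empty() {
        None
    } else {
        Some(field)
    }
}

// Assumption: with serde, the same logic would live inside a function named
// by `#[serde(deserialize_with = "...")]` on the field. That function would
// first deserialize a `String` and then apply `empty_to_none` to it.
```

Note that whitespace-only fields are kept as Some here; whether to trim first is a choice for the calling project.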
Nice, I'll try that. Thanks again.
Given this input file, I would like the c_string value in row 2 to be returned as None and the c_string value in row 3 to be returned as Some(""). This would be more consistent with how other types are handled.
Is this desirable behavior? If so, I'd like to try and add this feature.