Skip to content

Use and_modify in the docs for hash map instead #3582

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

ryanwhitehouse
Copy link

This PR updates the rust book to suggest and_modify.

Currently the rust book suggests using the following

use std::collections::HashMap;

    let text = "hello world wonderful world";

    let mut map = HashMap::new();

    for word in text.split_whitespace() {
        let count = map.entry(word).or_insert(0);
        *count += 1;
    }

    println!("{:?}", map);

https://doc.rust-lang.org/book/ch08-03-hash-maps.html?highlight=hashmap#creating-a-new-hash-map

The docs for hashmap use and_modify https://doc.rust-lang.org/std/collections/hash_map/enum.Entry.html

use std::collections::HashMap;

let mut map: HashMap<&str, u32> = HashMap::new();

map.entry("poneyland")
   .and_modify(|e| { *e += 1 })
   .or_insert(42);
assert_eq!(map["poneyland"], 42);

map.entry("poneyland")
   .and_modify(|e| { *e += 1 })
   .or_insert(42);
assert_eq!(map["poneyland"], 43);

@ryanwhitehouse
Copy link
Author

Similar to this change, however that only affected the docs and not the book rust-lang/rust#98122

Comment on lines -10 to +12
let count = map.entry(word).or_insert(0);
*count += 1;
map.entry(word)
.and_modify(|existing_value| *existing_value += 1)
.or_insert(1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is necessarily better than the existing snippet. The increase in the number of characters suggests that this is a slightly more complicated method than the previous one. It introduces a new method and_modify (which wasn't needed before) and makes it irrelevant that the or_insert returns something.

Due to that, the description of what happened in the code is now completely out of sync:

The split_whitespace method returns (...). The or_insert method returns a mutable reference (&mut V) to the value for the specified key. Here we store that mutable reference in the count variable, so in order to assign to that value, we must first dereference count using the asterisk (*). The mutable reference goes out of scope at the end of the for loop, so all of these changes are safe and allowed by the borrowing rules.

So you'd have to explain and_modify now, and you failed to show that or_insert returns something and how it can be used.

In contrast, currently the book uses or_insert already in the previous section Adding a Key and Value Only If a Key Isn’t Present and hints that:

The or_insert method on Entry is defined to return a mutable reference to the value for the corresponding Entry key if that key exists, and if not, inserts the parameter as the new value for this key and returns a mutable reference to the new value.

This is also not shown in that section, but I was happy to see it in the code in the following section when reading this part of book.


I personally don't think the pattern you introduced is better in this particular case also for another reason: the increment value 1 is duplicated in and_modify and or_insert. With the literal 1, this is not that of an issue, but it would be if the increment was more complicated. Consider this:

        map.entry(word)
            .and_modify(|existing_value| *existing_value += word.len())
            .or_insert(word.len());

whereas with the existing approach:

        let count = map.entry(word).or_insert(0);
        *count += word.len();

And to me, the existing pattern is also simpler and easier to follow - first, you read the existing value in the hash map or initialize it, and then you care about updating it.

Copy link
Contributor

@generalmimon generalmimon Mar 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And to me, the existing pattern is also simpler and easier to follow - first, you read the existing value in the hash map or initialize it, and then you care about updating it.

This is pretty much what @SkiFire13 wrote in rust-lang/rust#98122 (comment) too:

It just seems more linear and thus simplier to me:

  • if an entry is empty the default value is 0 (this is in addition to what normally happens)
  • then increase it (this always happens)

In contrast with and_modify:

  • if there's an entry increase it (which is conditional on not being the first element)
  • else set it to 1 (which is condition to being the first element)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that makes sense. Especially if you have to repeat word.len() as in your example above.
I am happy to close this issue. I think and_modify should be introduced as an alternative as it will be better in some cases, but this is probably not the way to do it. It would probably require a note or it's own section which introduced an example whereby and_modify was much cleaner.

Thanks for your review!

@generalmimon
Copy link
Contributor

@ryanwhitehouse:

The docs for hashmap use and_modify https://doc.rust-lang.org/std/collections/hash_map/enum.Entry.html

Well, this is pretty much a non-argument, because they show the way with just or_insert too (https://doc.rust-lang.org/std/collections/struct.HashMap.html#examples):

// update a key, guarding against the key possibly not being set
let stat = player_stats.entry("attack").or_insert(100);
*stat += random_stat_buff();

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants