An Average Programmer's Quest to Learn Rust - Part 1: Searching Text

Learning Rust while having fun

March 14, 2022 - 5 minute read -

I am tired after years of programming in languages like Python, JavaScript/Node, Ruby, etc. As my programs got more complicated, I felt these languages were holding me back. So I started researching whether there are better languages, and I came across Clojure and Rust. Though I love Clojure, there’s a mysterious force pushing me to explore Rust.

I am not one of those smart programmers. Rust is overwhelming! But the mysterious force isn’t allowing me to give up. So I started reading Rust in Action. Later, I thought, why not take notes and share them with fellow explorers?

I am writing a program to search for a given string in a pile of text. I want the output to show me the exact line with the string, plus the lines above and below it, just to get the context.

Starting a new project was pretty simple

cargo new search-a-string

Here’s the plan

  1. Go through all the lines and check whether each line contains the string.
  2. If a line contains the string, tag that line by storing its line number in an array.
  3. Later, loop through the text again and check whether each line is a tagged line or sits directly above or below one.
  4. Print those lines to provide context.

let query = "Rust"; 
let text = "\
    Rust is a multi-paradigm, 
    general-purpose programming
    language designed for performance and safety, especially safe concurrency. 
    Rust is syntactically similar to C++,
    but can guarantee memory safety 
    by using a borrow checker to 
    validate references.";

I read that variables are immutable (can’t be changed) by default in Rust. But then, why are they still called variables?

Now, I need an array that can store line numbers of the text with the query string.

let mut tags // line not complete yet

“mut” tells Rust that it’s a real variable and that we should be allowed to change it.
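Here is a small sketch of the difference between a plain binding and a “mut” one. The function name count_up is made up for this illustration; it is not part of the search program.

```rust
// Hypothetical helper, only for illustration.
fn count_up(times: usize) -> usize {
    let start = 0; // immutable binding: writing `start = 1;` later would not compile
    let mut total = start; // `mut` lets us change the value afterwards
    for _ in 0..times {
        total += 1;
    }
    total
}

fn main() {
    println!("{}", count_up(3)); // prints 3
}
```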

This array needs to have a dynamic length. In other words, it is a vector.

let mut tags: Vec // line not complete yet

Now, what is the type of the elements inside the vector? In this case they are line numbers (ex: 1, 4, 1000), so basically non-negative integers, or technically unsigned integers. “usize” is the unsigned integer type Rust uses for indexes and lengths.

let mut tags: Vec<usize> // line not complete yet

Finally, we need to initialise it. You may be familiar with “new …()” in other languages.

let mut tags: Vec<usize> = vec![]; 

“vec![]” is shorthand for “Vec::new()”.
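A quick way to convince yourself that the two forms give the same empty vector is to compare them. This is only a sketch, and the helper names tags_from_macro and tags_from_new are hypothetical.

```rust
// Hypothetical helpers, named only for this illustration.
fn tags_from_macro() -> Vec<usize> {
    vec![] // the macro form with no elements
}

fn tags_from_new() -> Vec<usize> {
    Vec::new() // the constructor form
}

fn main() {
    // Both are empty, growable vectors of usize.
    assert_eq!(tags_from_macro(), tags_from_new());

    // The macro can also take initial elements.
    let mut tags: Vec<usize> = vec![0, 3];
    tags.push(7);
    println!("{:?}", tags); // prints [0, 3, 7]
}
```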

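Yeah, that’s a lot of stuff. Unlike the other languages I have used, Rust doesn’t want to make assumptions at run time. It can often infer a type on its own, but it still has to know every type at compile time. Giving it clarity on things helps Rust detect issues early and optimize for performance.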

Next, we need a vector to store the context lines.

let mut context: Vec //line not complete yet

For each match, we will need a vector with up to three elements (the line above, the line itself and the line below), each paired with its line number. Basically, we are creating a vector inside a vector.

let mut context: Vec<Vec<(usize, String)>> = vec![];

To help you picture the above line: the outer vector has one entry per match, and each inner vector holds the context lines for that match. “usize” is for the line number, “String” is to store the line.
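If the nesting is hard to picture, it may help to build a small value of this type by hand. The sketch below uses a hypothetical helper called sample_context and shortened line text; the real program fills the vector in a loop instead.

```rust
// Hypothetical helper that builds the context for two matches by hand.
fn sample_context() -> Vec<Vec<(usize, String)>> {
    vec![
        // Context for the first match: (line number, line text) pairs.
        vec![
            (0, String::from("Rust is a multi-paradigm,")),
            (1, String::from("general-purpose programming")),
        ],
        // Context for the second match.
        vec![
            (2, String::from("language designed for performance")),
            (3, String::from("Rust is syntactically similar to C++,")),
            (4, String::from("but can guarantee memory safety")),
        ],
    ]
}

fn main() {
    let context = sample_context();
    println!("{}", context.len()); // prints 2, one inner vector per match
    println!("{:?}", context[1][1]); // prints (3, "Rust is syntactically similar to C++,")
}
```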

Alright, let’s iterate through the lines and tag the ones that contain the query string.

for (i, line) in text.lines().enumerate() {
    if line.contains(query) {
        tags.push(i);

        let v = Vec::with_capacity(3);
        context.push(v);
    }
}
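To try the tagging step on its own, it can be pulled into a small function. This is a sketch with a made-up name (tag_lines) and a made-up sample text. Note that enumerate() starts counting at 0, so the stored numbers are zero-based indexes.

```rust
// Hypothetical helper: returns the zero-based indexes of lines containing `query`.
fn tag_lines(text: &str, query: &str) -> Vec<usize> {
    let mut tags: Vec<usize> = vec![];
    for (i, line) in text.lines().enumerate() {
        if line.contains(query) {
            tags.push(i);
        }
    }
    tags
}

fn main() {
    let text = "Rust is fast\nPython is friendly\nI like Rust";
    println!("{:?}", tag_lines(text, "Rust")); // prints [0, 2]
}
```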

If there’s no match, we want to stop the program.

if tags.is_empty() {
    println!("Unable to find {}", query);
    return;
}

Now let’s perform another iteration to store the context.

for (i, line) in text.lines().enumerate() {
    for (j, tag) in tags.iter().enumerate() {
        let lower_bound = tag.saturating_sub(1);
        let upper_bound = tag + 1;

        if (i >= lower_bound) && (i <= upper_bound) {
            let line_string = String::from(line);
            let context_line = (i, line_string);
            context[j].push(context_line);
        }
    }
}

What’s this?

let lower_bound = tag.saturating_sub(1);  

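“usize” is an unsigned type, so it can’t hold a negative value. If a match is on the very first line (index 0), a plain “tag - 1” would overflow: by default Rust panics in debug builds and wraps around to a huge number in release builds. “saturating_sub” stops at 0 instead, so the result never goes below 0.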
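A tiny sketch makes the behaviour visible. The function name lower_bound is borrowed from the variable above; wrapping it in a function is my own assumption, done only for illustration.

```rust
// Hypothetical helper: the line index just above `tag`, clamped at 0.
fn lower_bound(tag: usize) -> usize {
    tag.saturating_sub(1)
}

fn main() {
    println!("{}", lower_bound(3)); // prints 2
    println!("{}", lower_bound(0)); // prints 0 instead of overflowing

    // `checked_sub` is a related method that returns None when the result would go below 0.
    println!("{:?}", 0usize.checked_sub(1)); // prints None
}
```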

And this?

let line_string = String::from(line);

We cannot store “line” directly because it’s a “&str”, a reference that borrows from “text”, while our vector holds owned “String” values. The above line copies the value into a new “String” that the vector can keep.
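Here is a minimal sketch of that copy step in isolation. The helper name own_lines is hypothetical. As far as I can tell, line.to_string() and line.to_owned() would do the same job as String::from(line).

```rust
// Hypothetical helper: copies every borrowed line into an owned String.
fn own_lines(text: &str) -> Vec<String> {
    let mut owned: Vec<String> = vec![];
    for line in text.lines() {
        // `line` is a &str that borrows from `text`; String::from makes an owned copy.
        owned.push(String::from(line));
    }
    owned
}

fn main() {
    let lines = own_lines("first\nsecond");
    println!("{:?}", lines); // prints ["first", "second"]
}
```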

One last loop to print the context.

for context_line in context.iter() {
    for &(i, ref line) in context_line.iter() {
        let line_num = i + 1;
        println!("{}: {}", line_num, line);
    }
}

What’s this?

for &(i, ref line) 

Yeah, I found it weird too, but I will solve the puzzle and explain it in an upcoming post.

Let’s test it out.

cargo run

Here’s the output.

1: Rust is a multi-paradigm, 
2:     general-purpose programming
3:     language designed for performance and safety, especially safe concurrency. 
4:     Rust is syntactically similar to C++,
5:     but can guarantee memory safety 
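As a recap, the steps from this post could be wrapped in one function so they are easier to experiment with. This is only a sketch: the function name search_with_context and the short sample text are my own assumptions, not the program linked below.

```rust
// Hypothetical function wrapping the steps from this post.
fn search_with_context(text: &str, query: &str) -> Vec<Vec<(usize, String)>> {
    let mut tags: Vec<usize> = vec![];
    let mut context: Vec<Vec<(usize, String)>> = vec![];

    // First pass: tag matching lines and reserve a context slot for each.
    for (i, line) in text.lines().enumerate() {
        if line.contains(query) {
            tags.push(i);
            context.push(Vec::with_capacity(3));
        }
    }

    // Second pass: collect each tagged line together with its neighbours.
    for (i, line) in text.lines().enumerate() {
        for (j, tag) in tags.iter().enumerate() {
            let lower_bound = tag.saturating_sub(1);
            let upper_bound = tag + 1;
            if i >= lower_bound && i <= upper_bound {
                context[j].push((i, String::from(line)));
            }
        }
    }
    context
}

fn main() {
    let text = "alpha\nRust one\nbeta\ngamma\nRust two";
    for group in search_with_context(text, "Rust") {
        for (i, line) in group {
            // Stored indexes are zero-based, so add 1 for display.
            println!("{}: {}", i + 1, line);
        }
    }
}
```

With this sample text, the first match picks up lines 1 to 3 and the second picks up lines 4 and 5, since there is no line below the last one.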

You can find the full source code here.

I am creating a small community for curious and audacious people who want to learn difficult technologies like Rust. Join us here, so that you won’t miss the upcoming posts.