web developer | musician | venue booker

CLI Data Gem Project

For this project, I decided to create a CLI Gem that scrapes vintagebassworld.com to give the user data about many different vintage bass guitars. Since I am a bassist, I found the project super informative and really helpful!

My original approach to coding this was based on Avi's video, CLI Gem Walkthrough. At first, everything was going great, but because I was perhaps a little too ambitious about how I wanted the data to be structured, I hit a wall pretty hard. I assumed it would be easy to search by Brand, Model, or Year. That proved to be very difficult, because searching by year would have required a massive scrape of over 100 pages within the vintagebassworld.com domain. So after a full day of coding, I decided to scrap everything and start over with a simpler, more top-down approach.

After watching Avi's second video, https://www.youtube.com/watch?v=Y5X6NRQi0bU, I decided to model my 2nd attempt after its structure (with some of my original structure as well). This worked out well, and made the data flow much simpler for me to see, as it was more in line with the code I've been working with in previous Flatiron lessons. My top-down approach made sense too. The user is presented with a list of 5 brands to choose from (this was hard-coded; I didn't see it as necessary to scrape the website for the brand names). Once the user makes their selection, a new instance of the Scraper class is created, which in turn creates a new instance of the Brand class. Once that happens, all the other Scraper methods are able to do their thing. The #scrape method returns a list of all the available models of that particular brand and, at the same time, makes new instances of the Model class and adds those newly created instances to the Brand instance. The user then chooses which model they are interested in, and from there the #scrape_instruments method is called, which returns a list of every year that particular model was manufactured. It also creates a new instance of the Instrument class for each year, and adds each instance to the Model object.

Once the user chooses which particular year they are interested in (or, as it is defined within the code, which Instrument), the #scrape_description method is called, returning the description as well as adding it to the instance of Instrument.
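To make that flow a bit more concrete, here is a minimal sketch of how the four classes could fit together. This is not the gem's actual code: the attribute names, the add_model and add_instrument helpers, and the FAKE_SITE hash (canned placeholder data standing in for the real HTTP requests and parsing) are all assumptions for illustration. Only the class names and the #scrape, #scrape_instruments, and #scrape_description method names come from the description above.

```ruby
# Hypothetical sketch of the object model. Everything except the class names
# and the three scrape method names is an assumption.
class Instrument
  attr_reader :year
  attr_accessor :description

  def initialize(year)
    @year = year
  end
end

class Model
  attr_reader :name, :instruments

  def initialize(name)
    @name = name
    @instruments = []
  end

  def add_instrument(instrument)
    @instruments << instrument
  end
end

class Brand
  attr_reader :name, :models

  def initialize(name)
    @name = name
    @models = []
  end

  def add_model(model)
    @models << model
  end
end

class Scraper
  # Made-up placeholder data; the real gem would fetch and parse pages here.
  FAKE_SITE = {
    "Example Brand" => {
      "Example Model" => {
        "1960" => "Placeholder description for the 1960 instrument.",
        "1965" => "Placeholder description for the 1965 instrument."
      }
    }
  }.freeze

  attr_reader :brand

  # Creating a Scraper also creates the Brand it will fill in.
  def initialize(brand_name)
    @brand = Brand.new(brand_name)
  end

  # Builds a Model per model name, attaches it to the Brand, returns the names.
  def scrape
    FAKE_SITE.fetch(brand.name).each_key { |name| brand.add_model(Model.new(name)) }
    brand.models.map(&:name)
  end

  # Builds an Instrument per year, attaches it to the Model, returns the years.
  def scrape_instruments(model)
    years = FAKE_SITE.fetch(brand.name).fetch(model.name)
    years.each_key { |year| model.add_instrument(Instrument.new(year)) }
    model.instruments.map(&:year)
  end

  # Stores the description on the Instrument and returns it.
  def scrape_description(model, instrument)
    instrument.description =
      FAKE_SITE.fetch(brand.name).fetch(model.name).fetch(instrument.year)
  end
end

scraper = Scraper.new("Example Brand")
model_names = scraper.scrape
model = scraper.brand.models.first
years = scraper.scrape_instruments(model)
instrument = model.instruments.first
description = scraper.scrape_description(model, instrument)

puts model_names.inspect
puts years.inspect
puts description
```

The nice part of this shape is that each menu level only scrapes what the user actually asked for, so nothing beyond the chosen brand, model, and year ever has to be fetched.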

I had some major challenges doing the scraping, because the website I was scraping from is old, and has all sorts of weird drop-down menus, menus within menus, and all that confusing stuff. Also, the site was not coded with specific CSS class names. At best there were 3 or 4 CSS classes to start from. Eventually I determined the best way to get the data was via XPath. Not ideal I guess, but it did work. After a few refinements, I was able to get data via iteration. One big challenge I had was string interpolation within an XPath. In the XPaths I was pulling from Chrome, the quotes around CSS class names were single quotes. It took me some time to realize why my string interpolations weren't working: the #{} syntax only works inside double-quoted strings!!! I also learned about escape sequences within a string, like when you need to actually have a quote inside a string. So I learned about \" and all that good stuff.
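Here is a small sketch of that quoting gotcha using plain Ruby strings, no scraping library needed. The css_class value and the XPath expressions are made up for illustration and are not the ones from my gem.

```ruby
css_class = "menu-item" # hypothetical class name, not from the real site

# Single-quoted Ruby string: #{} is NOT interpolated, it stays literal.
single = '//a[@class="#{css_class}"]'

# Double-quoted Ruby string with single quotes inside the XPath: interpolated.
double = "//a[@class='#{css_class}']"

# Double-quoted Ruby string with escaped double quotes inside: also interpolated.
escaped = "//a[@class=\"#{css_class}\"]"

puts single  # prints //a[@class="#{css_class}"]
puts double  # prints //a[@class='menu-item']
puts escaped # prints //a[@class="menu-item"]
```

Either of the last two forms gives a usable XPath string. Which one to pick mostly comes down to which kind of quote the expression already has inside it.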

All in all, a great project! It was great to go out on my own and learn how to do things without the option of asking a question on learn.co. All that trial and error throughout the process, as well as all the research, is what it's really all about.