Seeding a Database in Ruby on Rails

Ruby on Rails has excellent tools for seeding a database, and thanks to the work of the community, some gems make this task easier.
Apart from seeding a database, we have helpful tools to check the database and better organize the data seeds.
Creating a sample application
Let’s start by creating a new application:
rails new sample
cd sample
Creating a model
Next, generate a new model. If you are curious, by typing rails generate (or the rails g shortcut), you will see all the available generators.
bin/rails g model Movie title director storyline:text watched_on:date
Here you are setting title and director as strings (the default type if not specified), storyline as text, and watched_on as date (when naming date columns, as opposed to datetime columns, the convention is to append _on to the name).
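The name:type rule can be illustrated with a few lines of plain Ruby. This is only a sketch of the convention, not the generator's actual parser, and parse_attributes is a hypothetical helper introduced for illustration.

```ruby
# Hypothetical helper: splits generator-style arguments into name/type pairs.
# An argument without an explicit type falls back to "string", mirroring the
# default described above.
def parse_attributes(arguments)
  arguments.to_h do |argument|
    name, type = argument.split(":", 2)
    [name, type || "string"]
  end
end

parsed = parse_attributes(%w[title director storyline:text watched_on:date])
parsed.each { |name, type| puts "#{name}: #{type}" }
```

Running it lists title and director as string, storyline as text, and watched_on as date.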
Rails will generate a migration for you adapted to the default database, which is SQLite. Migrations are saved in db/migrate. Let’s see what it looks like:
class CreateMovies < ActiveRecord::Migration[7.1]
  def change
    create_table :movies do |t|
      t.string :title
      t.string :director
      t.text :storyline
      t.date :watched_on

      t.timestamps
    end
  end
end
As you can see, Rails adds the version you are using in square brackets at the end of the parent class.
The timestamps statement will generate the created_at and updated_at fields automatically.
Let’s run it:
bin/rails db:migrate
Now Rails has actually created the table. If you made a mistake, you can always roll back:
bin/rails db:rollback
This command accepts an optional STEP parameter to go back as many migrations as needed. For example, if you want to undo 2 migrations, you can do it like this:
bin/rails db:rollback STEP=2
Let’s see what the schema (db/schema.rb) looks like after running the migration:
ActiveRecord::Schema[7.1].define(version: 2024_08_04_102643) do
  create_table "movies", force: :cascade do |t|
    t.string "title"
    t.string "director"
    t.text "storyline"
    t.date "watched_on"
    t.datetime "created_at", null: false
    t.datetime "updated_at", null: false
  end
end
This file will contain the entire database schema as we run more migrations.
Rails commands
By using the -T parameter you can see a list of the available Rails commands:
bin/rails -T
You can even filter by namespace, such as db:
bin/rails -T db
Creating some seeds
Let’s get to the interesting part of this article. Open the seeds file, db/seeds.rb, and paste this:
Movie.destroy_all
Movie.create!([
  {
    title: "Inside Out",
    director: "Pete Docter",
    storyline: "As Riley’s family moves to a new city, her emotions —Joy, Sadness, Anger, Fear, and Disgust— navigate the challenges of adjusting to a new environment.",
    watched_on: 9.years.ago
  },
  {
    title: "Toy Story 4",
    director: "Josh Cooley",
    storyline: "Woody, Buzz Lightyear and the rest of the gang embark on a road trip with Bonnie and a new toy named Forky. The adventurous journey turns into an unexpected reunion as Woody's slight detour leads him to his long-lost friend Bo Peep.",
    watched_on: 5.years.ago
  },
  {
    title: "Soul",
    director: "Pete Docter",
    storyline: "After landing the gig of a lifetime, a New York jazz pianist suddenly finds himself trapped in a strange land between Earth and the afterlife.",
    watched_on: 4.years.ago
  }
])
p "Created #{Movie.count} movies"
First, you destroy all movies to start from a clean state, then add three movies by passing an array to the create! method. Those handy X.ago expressions for defining dates are provided by the Active Support Core Extensions.
In the end, there’s some feedback about the total movies created. Let’s run it!
bin/rails db:seed
# "Created 3 movies"
You can execute this command as many times as you need, since existing records are deleted by the destroy_all statement on the first line.
To check them, you can use rails runner:
bin/rails runner "p Movie.pluck :title"
# ["Inside Out", "Toy Story 4", "Soul"]
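For readers curious about the X.ago helpers used in the seeds above, here is a rough plain-Ruby equivalent built on the standard library's Date class. It is a sketch of the idea only: in Rails, 9.years.ago produces a time value that the date column then stores as a date, and years_ago is a hypothetical helper that is not part of Active Support.

```ruby
require "date"

# Hypothetical helper: returns the date that lies `count` years before `today`.
# Date#prev_year comes from Ruby's standard library.
def years_ago(count, today = Date.today)
  today.prev_year(count)
end

puts years_ago(9, Date.new(2024, 8, 4)) # prints 2015-08-04
```

A fixed reference date is passed in the example so the result does not depend on the day the script runs.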
Using a custom Rails task to seed actual data
All your seeds are supposed to be development data, not actual data for production use. So, don’t seed in production the way you just did! Mainly because the first step deletes all movies!
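One way to protect yourself from that mistake is a small guard at the top of the seeds file. The following is a minimal sketch under the assumption that the environment name is passed in explicitly; ensure_destructive_seeds_allowed! is a hypothetical helper, not something Rails provides.

```ruby
# Hypothetical guard: raises before destructive seeds run in production.
# `environment` is expected to be Rails.env (or any string-like value).
def ensure_destructive_seeds_allowed!(environment)
  return unless environment.to_s == "production"

  raise "Refusing to run destructive seeds in production"
end

# In db/seeds.rb this might be called as the very first line:
#   ensure_destructive_seeds_allowed!(Rails.env)
ensure_destructive_seeds_allowed!("development") # does nothing
```

Taking the environment as an argument keeps the helper easy to try outside a Rails application.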
To seed actual data, it is best to create a custom Rails task. Let’s generate one to add genres.
First generate the model and then migrate the database. Next, create the task.
bin/rails g model Genre name
bin/rails db:migrate
bin/rails g task movies seed_genres
This command creates a movies.rake file in the lib/tasks directory containing the seed_genres task.
Paste this code into the new rake file:
namespace :movies do
  desc "Seeds genres"
  task seed_genres: :environment do
    Genre.create!([{ name: "Action" }, { name: "Sci-Fi" }, { name: "Adventure" }])

    p "Created #{Genre.count} genres"
  end
end
It’s now listed in the Rails commands list:
bin/rails -T movies
Time to run it!
bin/rails movies:seed_genres
# "Created 3 genres"
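Note that create! always inserts, so running this task a second time would add each genre again. One possible way to make the task safe to rerun is Active Record's find_or_create_by!, which only creates a record when no match exists. The sketch below assumes the model class is passed in as an argument; seed_genres as a standalone method is a hypothetical helper introduced for illustration.

```ruby
# Hypothetical helper: creates each genre only if it does not exist yet.
# `model` is expected to be the Genre class; find_or_create_by! and count are
# standard Active Record methods.
def seed_genres(model, names = ["Action", "Sci-Fi", "Adventure"])
  names.each { |name| model.find_or_create_by!(name: name) }
  model.count
end

# Inside the rake task, the body could then become:
#   p "#{seed_genres(Genre)} genres in the database"
```

With this approach, the reported count stays the same no matter how many times the task runs.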
Loading seeds using the console
The console is handy for playing with your data. Let’s open it:
bin/rails c
Did you know that you can load your seeds from inside the console? Try this:
Rails.application.load_seed
Playing with data using the console sandbox
Sometimes you will need to run destructive commands on real data in your development or production environment without making the changes permanent. It’s kind of like a safe mode where you can do whatever you want and then go to a previous state.
This mode is called sandbox. Any changes you make are rolled back when you exit. You can access it with this command:
bin/rails c --sandbox
This technique is handy for debugging a real database, such as when a user says they are trying to update their profile name and see a weird error. You could reproduce that error directly using the sandbox mode without affecting the actual data.
Loading more seeds using Faker
If you need, for example, 100 movies, you can replace your seeds file with this:
Movie.destroy_all
100.times do |index|
  Movie.create!(
    title: "Title #{index}",
    director: "Director #{index}",
    storyline: "Storyline #{index}",
    watched_on: index.days.ago
  )
end
p "Created #{Movie.count} movies"
Now run the seed task:
bin/rails db:seed
# "Created 100 movies"
But the result does not look realistic at all:
bin/rails runner "p Movie.select(:title, :director, :storyline).last"
#<Movie id: nil, title: "Title 99", director: "Director 99", storyline: "Storyline 99">
Time to use Faker, a gem that generates random fake values. Add it into the development group in your Gemfile:
bundle add faker --group development
This command also runs bundle install for you. Now replace your seeds file with this:
Movie.destroy_all
100.times do |index|
  Movie.create!(
    title: Faker::Movie.title,
    director: Faker::Name.name,
    storyline: Faker::Lorem.paragraph,
    watched_on: Faker::Time.between(from: 4.months.ago, to: 1.week.ago)
  )
end
p "Created #{Movie.count} movies"
Check it out again:
bin/rails db:seed
# "Created 100 movies"
bin/rails runner "p Movie.select(:title, :director, :storyline, :watched_on).last"
#<Movie title: "The Big Lebowski", director: "Alfonso Kertzmann", storyline: "Sit ea aut. Iure iusto deserunt. Ratione autem aut...", watched_on: "2024-05-24", id: nil>
Much better!
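To get an intuition for what a fake-data generator does, here is a tiny plain-Ruby sketch that samples from word lists. It is not how Faker is implemented: the word lists, the fake_movie_attributes helper, and the 7 to 120 day window (a rough stand-in for the range used above) are all assumptions made for illustration.

```ruby
require "date"

# Hypothetical generator: builds one movie's attributes by sampling from small
# word lists. Passing a seeded Random makes the output repeatable.
def fake_movie_attributes(random, today = Date.today)
  adjectives = %w[Silent Golden Hidden Last]
  nouns = %w[River Empire Garden Voyage]
  first_names = %w[Ana Luis Marta Pablo]
  last_names = %w[Romero Vidal Castro Ortega]

  {
    title: "The #{adjectives.sample(random: random)} #{nouns.sample(random: random)}",
    director: "#{first_names.sample(random: random)} #{last_names.sample(random: random)}",
    # A date between 120 and 7 days before `today`.
    watched_on: today - random.rand(7..120)
  }
end

reference_day = Date.new(2024, 8, 4)
first = fake_movie_attributes(Random.new(42), reference_day)
second = fake_movie_attributes(Random.new(42), reference_day)
puts first == second # prints true: same seed, same data
```

Seeding the random generator is useful when you want every developer to get the same sample data. Faker documents a similar option, Faker::Config.random, for deterministic output.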
Test your knowledge
- What is the primary purpose of the db/seeds.rb file in a Ruby on Rails application?
- How do you run the seeds file to populate the database with seed data?
- Why is it not recommended to seed production data using the db/seeds.rb file?
- What is the purpose of creating a custom Rails task for seeding actual data?
- What is the advantage of using the Rails console in sandbox mode?
Conclusion
Seeding the database while developing the application is essential, as it gives you the feeling of working with actual data.
Also, knowing the tools available for working with seeds is good for your productivity, so it is worth investing time in learning them.
If you are new to Ruby on Rails, I would like to recommend a course I created on LinkedIn Learning where I explain the basics of the framework from a practical point of view. You will learn the basics of Active Record, the Scaffold generator, the asset pipeline, ERB code, working with images with Active Storage, reactive code with Hotwire, and much more.