Anaylzing with PostgresSQL

Yesterday’s daily project was to analyze data with PostgreSQL using complex statements.

First, we needed to download the sample dataset and import the data into our new database from Github that our instructor provided. Here was my first mistake–never click the “download” button on GitHub if the file size is more than 50mb. My browser kept crashing until I realized I needed to right-click the download button and simply save the link as a file into my computer.

With that first mistake out of the way, I went to the terminal and created a folder called “AnalyzeSQL” where I created a database called votes using this code:

createdb votes
psql votes
psql votes < voterdb.sql

To see how many tables I’m working with, I typed in \d and received this:

Screen Shot 2017-09-01 at 8.41.04 PM

Now that I know what tables to select, I needed to know what columns from each of these tables I’m working with.

select * from issues;

Screen Shot 2017-09-01 at 8.45.42 PM

select * from states limit 5;

Screen Shot 2017-09-01 at 8.46.42 PM

select * from cities limit 5;

Screen Shot 2017-09-01 at 8.47.28 PM

select * from votes limit 5;

Screen Shot 2017-09-01 at 8.48.24 PM

Now, it’s time to analyze the data I was given and needed to answer the following six questions:

Were there more male or female voters in Greensboro?
select
gender, count(gender) from votes join
cities on cities.id=votes."cityId" where
name='Greensboro' group by
votes.gender;

Here is the output:

Screen Shot 2017-09-01 at 9.01.25 PM


What is the average age of voters in Orlando?

select
avg(age) from cities join
votes on cities.id=votes."cityId" where
name='Orlando';

Here is the output:

Screen Shot 2017-09-01 at 9.03.04 PM

Which state have the most cities listed?
select
states.name, count(cities.name) from
states join cities on states.id=cities."stateId" group by
states.name order by
count(cities.name) desc limit(1);

Here is the output:

Screen Shot 2017-09-01 at 9.04.12 PM

Which state had the most votes placed?
select
states.name, count(votes.id) from
states join cities on states.id=cities."stateId" join
votes on votes."cityId"=cities.id group by
states.name order by
count(votes.id) desc limit 1;

Here is the output:

Screen Shot 2017-09-01 at 9.06.57 PM

What percentage of cities passed the minimum wage issue (> 50% of votes)?
select
count(*) from (select count(in_favor), cities.name from
votes join cities on votes."cityId"=cities.id join
issues on issues.id=votes."issueId" where
in_favor='t' and "issueId"=2 group by
cities.name having count(in_favor) > 100) as
totally_in_favor;

Here is the output:

Screen Shot 2017-09-01 at 9.07.56 PM

How many of the issues were approved in each of the following cities? An issue is ‘approved’ if it got more than 50% of the votes. The following states are Dallas, Atlanta, Anaheim and Boston. 
select
issues.id, cities.name from votes join
cities on votes."cityId" = cities.id join
issues on issues.id = votes."issueId" where in_favor='t' and cities.name similar to 'Dallas%|Atlanta%|Anaheim%|Boston%' group by cities.name, issues.id having count(in_favor) > 100;

Here is the output:

Screen Shot 2017-09-01 at 9.08.36 PM.png

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Powered by WordPress.com.

Up ↑

%d bloggers like this: