"In the world ye shall have tribulation: but be of good cheer; I have overcome the world." –John 16:33

San Pedro Garza Garcia

Tag: SAS

How to teach computer programming to kids (Part 7) Statistical programming in R and SAS

Day 613 of 1000

This is the seventh in a series of posts on how we taught our children to program, what we did wrong and how we think we could have done better.  You can see the introductory post and index to the series by clicking here.


Kelly did not start learning to program until she was already in college.  The very first pass at this was when our friend Troy helped her in his work at the Biological and Agricultural Engineering laboratory at North Carolina State University.  We got her a book on R programming.  She spent about three weeks working her way through four or five chapters of the book before she spent about three weeks in the lab with Troy.  I did not sit with Kelly at all during that process, and even though she learned a lot, she felt pretty bad that she contributed so little.  That was my first clue that I had failed terribly by not teaching Kelly to program.

Nevertheless, a lot of good came out of Kelly’s six or so weeks of programming in the R language.  Kelly is a senior in Statistics at NCSU, so it was a very good thing that R was her first programming language.  Most programmers start with variables that hold a single piece of data and build bigger structures to hold sets of data.  Statistical languages start with whole data sets and let the programmer calculate measures of the data set and present the data in user-friendly formats.  There is a big difference in paradigm between statistical programming and more traditional languages like C/C++/C#, Java, and Python.

This semester Kelly took her first two programming classes:  SAS (statistical) programming and Java.  The difference between her preparedness for the two classes was stark.  She was way out in front of everyone in her statistical programming class because she had learned to think of programming in terms of how to manipulate entire data sets where looping is often internal.  In Java, not only did she have to learn how to think of data points rather than data sets, she also needed to create loops to work through all the points in the sets.  Her meager six weeks of R experience paid huge dividends in her statistics class and she excelled there.  It was a different story with the Java class.
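To make the contrast concrete, here is a minimal sketch of the two ways of thinking.  It is written in Python only to keep both styles in one short listing; it is not Kelly's coursework, and neither R nor SAS nor Java appears in it.  The data and the function names are made up for illustration.  The first function walks through the data one point at a time, the way her Java class expected; the second hands the whole data set to a single call and lets the looping happen internally, which is closer to how statistical languages feel.

```python
from statistics import mean

# Hypothetical sample data, invented purely for illustration.
heights_cm = [150.0, 160.0, 170.0, 180.0]


def mean_with_loop(values):
    """Data-point style: visit each value and accumulate by hand."""
    total = 0.0
    count = 0
    for value in values:
        total += value
        count += 1
    return total / count


def mean_whole_set(values):
    """Data-set style: pass the whole collection to one call.

    The looping still happens, but inside the library function.
    """
    return mean(values)


loop_result = mean_with_loop(heights_cm)
set_result = mean_whole_set(heights_cm)
print(loop_result, set_result)
```

Both functions give the same answer.  The difference is in what the programmer has to think about: in the first, every step through the data is spelled out; in the second, the data set is treated as a single thing.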

Kelly struggled and needed to spend many more hours during the first part of the semester just to stay even in Java.  By the middle of the semester, she had everything under control, but she did not start “getting” the material until the last few weeks of the semester.  Until that light went on, she kept her grade up with memorization and hard work.  It was frustrating to her and so unnecessary.  If we had done the same thing for Java as we did for R by giving her some training before she got to the class, her experience would have been completely different.  She gets the material now, but there could have been a good deal more joy and a deeper understanding of Java had I given her even a minimal amount of training.  Six weeks would have done it.

Why NCSU is a great school to study Statistics

Day 569 of 1000

It is tough to get a job these days.  I feel sorry for kids in university who need a summer internship or a job when they graduate.  Christian plans to go to grad school and has already had an internship, so it does not affect him so much.  Kelly, on the other hand, wants an internship this summer, so she went to two job fairs at NCSU to get leads.  She got six interviews.  She has received two job offers so far, but turned down one of them because it was not a good match.  She is one of two finalists for a third position and has not heard from a fourth.  In this market, that is a pretty amazing record.  I think the reason she received so many offers when others did not is that she studies Statistics.  It seems like there are a lot more jobs available to engineers, but there are also a lot more people chasing those jobs.  For each job that requires a statistician, there are far fewer people with the skills to do the work.

The other thing is that NCSU trains its Statistics majors in commonly used industry tools.  For example, Kelly has a class that teaches her how to perform statistical programming.  The programming environment they use is SAS, which is expensive enough that individual students cannot afford it.  The reason it is available to NCSU is that SAS started at NCSU and still has a close affiliation with the school.  At the end of the class, she should have learned everything necessary to get her first SAS certification.  The class even offers students the opportunity to take the certification test at a discounted rate.  The students use SAS and R, normal industry tools, to do their homework in other classes, too.  The expectation is that the students will be able to walk into a new job and contribute on the first day.

An ancillary benefit to the SAS training is the ability to talk about the use of these tools effectively in an interview.  I think this was huge in her last interview with one of the research labs at Johns Hopkins.  Kelly could explain in detail how she would accomplish specific tasks such as data cleaning and analysis.
