
How to branch/fork a (StatET) project with SVN

August 14, 2012 1 comment

I was introduced to version control at the 2011 Belgrade R+OSGeo in higher education summer school. I’ve been using it in my daily work ever since.

Recently the need to branch my project came up, and this post describes how I satisfied that need after a few hours of reading the internet. In a nutshell, you should prepare your SVN repository to accept branches, branch your local project, dance. Yes, it’s as simple as that.

First, you need to prepare your repository to be able to accept branches. You do that by creating a “branches” directory. The correct location is probably the root of your SVN repository. At least that’s how it works for me.

You now switch to the StatET view and “branch” your project. Right-click your project, then choose Team > Branch. Name your branch and check “Start working in the branch”.

If you refresh your SVN repository, you should now see your branched project.

And if you checked the “Start working in this branch” option, your (local) project should automatically be looking at this branch.

You’re ready to work on your branched/forked project. After you’re done, you can merge this branch with your original project. This aids in casual experimentation without long term consequences. Keep off of drugs, use protection, stay in school!

Categories: Uncategorized

Show me yours and I’ll show you mine

August 9, 2012 9 comments

I remember that when I started with R, I had little processing power to spare for an IDE. I had enough problems with the syntax, loops and the like, and the R GUI seemed adequate. When I started working on a heavy project, I had to knock it up a notch (bam!). After weeks of trial and error with various IDEs, I settled on Eclipse. The year was 2010.
After two years, I feel very comfortable in my IDE of choice, but I’ve always felt there are some things I might be missing. That’s why I’m starting the “show me yours and I’ll show you mine” project, where I wish to collect workflow setups for programming and/or data analysis. The idea is to present your setup and comment on why you think it’s (in)efficient for you. I’ll start!

As mentioned, I use Eclipse with the StatET plugin. Eclipse (and StatET) depend on Java, so you’ll probably have to install either the JRE or the JDK. This may be a limiting factor for some. Eclipse offers a number of handy keyboard shortcuts (for instance CTRL+r+3 sends a line/chunk to R, CTRL+r+s sources the entire file…), manages windows, provides different views and more.

My setup has two code editing windows in the upper left corner, with the project explorer and task list (kudos to Andrie) on the right. The bottom half holds the R console and R help/tasks. I can easily navigate through files while debugging programs, and handy keyboard shortcuts really cut down production time. I like having all windows handy. This is aided by the Mylyn Task List plugin, which helps you store and switch between individual sets of scripts. More about Mylyn can be found here. I also have a button to run a knitr script which produces a pdf report (see previous post). I connect to an SVN server where I store my work. Switching to the SVN view is achieved by clicking the “SVN” icon in the top right corner.

I would encourage anyone interested in sharing their ideas about how to set up their workflow on this blog to send me a screenshot and a short description to my gmail account (romunov) or post about it on their own internet outlet (blog, personal website…) and send a traceback back here.


Categories: R

Write data (frame) to Excel file using R package xlsx

February 12, 2012 25 comments

Writing to Excel files comes up rather often, especially if you’re collaborating with non-OSS users. There are several options, but I like the xlsx package way of doing things. Authors use Java to write to Excel files, which are basically compressed XML files.

Alright, let’s get cracking.

First, let’s create some data.

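# any two-column data frame will do; these values are only illustrative
sample.dataframe <- data.frame(index = 1:10, value = round(runif(10), 2))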

If you don’t have the file created yet, you can just write the data into a new file.

library(xlsx) #load the package
write.xlsx(x = sample.dataframe, file = "test.excelfile.xlsx",
        sheetName = "TestSheet", row.names = FALSE)

If you already have a file created, you can add data to a new sheet, or just add it to the existing one. Here’s how you would add a data.frame to columns D and E (result not shown).

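workbook.sheets <- loadWorkbook(file = "test.excelfile.xlsx") # load the existing workbook
workbook.test <- getSheets(workbook.sheets)$TestSheet # pick the sheet we want to write to
addDataFrame(x = sample.dataframe, sheet = workbook.test,
   row.names = FALSE, startColumn = 4) # write data to sheet starting on row 1, column 4
saveWorkbook(workbook.sheets, "test.excelfile.xlsx") # and of course you need to save it.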

You can now open the file in your WYSIWYG editor.

Categories: R

Adding lines or points to an existing barplot

January 15, 2011 5 comments

Sometimes you will need to add some points to an existing barplot. You might try

par(mfrow = c(1,2))
df <- data.frame(stolpec1 = 10 * runif(10), stolpec2 = 30 * runif(10))
barplot(df$stolpec1)
lines(df$stolpec2/10) # x is implicitly 1:10
points(df$stolpec2/10)

but you will get a funky looking line/points. It’s a bit squeezed. This happens because bars are not drawn at positions 1:10, but rather at something else. This “else” can be seen if you save the value returned by barplot. You will notice that it’s a matrix object with one column; these are the x axis positions of the bar midpoints. Now you need to feed this to your lines/points function as the value of the x argument and you’re all set.


df.bar <- barplot(df$stolpec1)
lines(x = df.bar, y = df$stolpec2/10)
points(x = df.bar, y = df$stolpec2/10)
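
To see what those midpoints look like, here is a minimal sketch with a made-up vector of three bar heights (heights and mids are names I picked just for this illustration). With plot = FALSE, barplot only computes and returns the positions without drawing anything.

```r
# three hypothetical bar heights, only for illustration
heights <- c(3, 5, 2)

# plot = FALSE returns the bar midpoints without opening a graphics device
mids <- barplot(heights, plot = FALSE)

# with the default bar width (1) and spacing (0.2) the midpoints are
as.vector(mids) # 0.7 1.9 3.1
```

The bars sit at 0.7, 1.9 and 3.1 rather than at 1, 2 and 3, which is why a line drawn against 1:10 looks squeezed.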

Another way of plotting this is using plotrix package. The controls are a bit different and it takes some time getting used to it.


library(plotrix)

barp(df$stolpec1, col = "grey70")
lines(df$stolpec2/10)
points(df$stolpec2/10)

 

Categories: R

Modeling sound pressure level of a rifle shot

November 1, 2010 3 comments

Noise can be classified as pollution, and lawmakers often (always?) treat it as such. Noise can come from different kinds of sources, a point source being among the simplest to model. Because noise has broad health implications, understanding its propagation, even through a simple model, can help us tone down or prevent an excessive noise burden on the environment and its inhabitants. In this work, I will focus on firing range noise and the propagation of sound to the surrounding area.
Small-scale firing ranges can be considered a point source of noise. To make a simple predictive model, a number of assumptions and generalizations are made. The reader should realize that this makes the model a bit less realistic.

When talking to experienced people, they will tell you that the distance between a firing range and the first house should be roughly 200 m. While there is no explicit mention of this number in Slovenian laws (yes, I’ve checked), there is a threshold of sound pressure level (SPL) of 75 dB. So, knowing the SPL of the rifle and the legal threshold, we can use a simple model to estimate the approximate distance at which the SPL will fall to or below the aforementioned legal threshold.

A rifle shot produces a sound pressure level of about 170 dB, which is roughly the sound of a jet engine at a 30 m distance (see here).

Noise propagates and dissipates through the air with roughly (source)

p ~ 1/r

which gives us


L_2 = L_1 - 20 × log_10(r_2/r_1)

where

L_2 = sound level at measured distance
L_1 = sound level at reference distance
r_1 = reference distance from source of the sound
r_2 = measured distance from the source
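
As a quick sanity check of this relation, here is a minimal sketch. The function name splAtDistance and the 100 dB reference level are made up for this illustration. Note that the decibel formula uses the base-10 logarithm, which R spells log10(); plain log() is the natural logarithm.

```r
# sound level at distance r2, given level L1 at reference distance r1
# (hypothetical helper, not part of any package)
splAtDistance <- function(r2, r1, L1) {
  L1 - 20 * log10(r2 / r1)
}

# doubling the distance costs about 6 dB
splAtDistance(r2 = 2, r1 = 1, L1 = 100)

# a tenfold increase in distance costs exactly 20 dB
splAtDistance(r2 = 10, r1 = 1, L1 = 100)
```

The familiar rule of thumb of roughly 6 dB per doubling of distance falls straight out of the formula.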

Using this model, we have accepted all sorts of assumptions, like calm weather, even terrain, even air pressure, no air resistance… Come to think of it, this model would be best suited for a desert in lovely weather. Nonetheless, it gives us a starting point.

I would be interested to hear from more knowledgeable readers about any potential mistakes and how to improve the model with regard to at least the above assumptions.

Modeling this equation in R is trivial. Let’s write a function that will calculate L_2 for a sequence of r_2 values.


soundPressure <- function(r2, r1, L1) {
 # sound level at distance r2, given level L1 at reference distance r1
 # decibels use the base-10 logarithm (log() in R is the natural logarithm)
 L2 <- L1 - 20 * log10(r2/r1)
 return(L2)
}

# let's define some parameters
distance <- seq(1, 100000, 1) # a vector of distances (in m) to be used as r_2
L1 <- 170
r1 <- 1

# this is the threshold level defined by the lawmaker
# we're actually interested in finding at what distance the noise
# dissipates to this level
dB.level <- 75

# the function is vectorized, so it can take the whole "distance" vector at once
dB <- soundPressure(distance, r1 = r1, L1 = L1)

# plotting
find.x <- which(dB <= dB.level)[1] # first distance at which the SPL is at or below 75 dB

plot(x = distance, y = dB, ylim = c(1, L1), xlab = "Distance (m)",
 ylab = "Sound pressure level (dB)", type = "l")
abline(h = dB.level, col = "red")
abline(v = distance[find.x], col = "red")
# distance label
text(x = distance[find.x], y = 0, offset = 0.5, col = "black",
 pos = 4, labels = paste(distance[find.x], "m"), cex = 1.3)
# SPL
text(x = 0, y = dB.level, col = "black", labels = paste(dB.level, "dB"),
 cex = 1.3, offset = 1, pos = 1)

Result of the plotting is

This tells us that, under these idealized assumptions, the sound pressure level drops to 75 dB (the legal threshold) only at roughly 56 km away from the rifle. Geometric spreading alone therefore cannot account for the 200 m rule of thumb; presumably the things this model ignores (absorption in the air, terrain, vegetation, barriers) do most of the work in practice.

As already mentioned, I would be happy to hear your comments on errors and how to improve the above model.

Building an R package (under Windows) without C, C++ or FORTRAN code

Why build an R package? It basically boils down to being able to brag at your local pub that a new version of YOUR package is on CRAN as of 7 p.m. CET. But seriously, if you’ve produced some functions that other people, like your boss, co-workers or students, might benefit from using (or have ordered from you), consider building a package. The reduced chance of broken dependencies and the ease of installing everything outweigh the effort of learning how to build one. If you feel your functions (that may be new in some respect) could benefit an even wider audience, consider submitting the package to CRAN (I will not discuss how to do that here, but do read the Ripley reference I mention later).

I have set out to build a test package to prepare myself for when the time comes and I really need to build one of my own. This is my attempt to document the steps I took when building a dummy package (called texasranger (yes, THE Texas Ranger!)) with one single function. I have attempted to build the documentation and all the other ladeeda things that are mandatory for the package to check out O.K. when building it.

Before you dig into the actual preparation and building itself, you will need a bunch of tools. These come bundled with a Linux distribution, but you will have to add them yourself if you’re on Windows. This is basically the only difference between building a package on Windows and on Linux. I will not go into details regarding these tools (Perl, the MS HTML Help compiler, a TeX distro and, if you have C/C++/FORTRAN code, the GNU compiler set). I will, however, advise you to check out Making R package under Windows (P. Rossi). There, you will find a detailed description (see pages 2-6) of how to get all the correct tools and how to set them up. When you have done so, you are invited to come back here. Feel free to follow the just-mentioned tutorial, as it goes a bit more in-depth in explaining various aspects. The author warns that MiKTeX will not work (see the date of the document), but things might have changed since then and it now works, at least for me.

I have followed the aforementioned Making R package under Windows (by P. Rossi), the slides Making an R package made by R.M. Ripley and of course the now famous Writing R Extensions (WRE) by the R Development Core Team (you are referred to this document everywhere). I would advise everyone to read them in the listed order, or at least read WRE last. The first two can be read from cover to cover in a few minutes; the last one is a good reference document for those pesky “details”. In my experience, I started to appreciate WRE only after I had read the first two documents.

Enough chit-chat, let’s get cracking!

1. These are the paths I entered (see the document by Rossi for what this is all about) to enable all the tools so that I can access them from the Command prompt (Command prompt can be found under Accessories, another term for it may be Terminal or Console on different OSs):

c:/rtools/bin;c:/program files/miktex 2.8/miktex/bin;c:/program files/ghostgum/gsview;C:/strawberry/perl/bin;c:/program files/r/r-2.11.0/bin;c:/program files/help workshop;%SystemRoot%\system32;%SystemRoot%;%SystemRoot%\System32\Wbem;%SYSTEMROOT%\System32\WindowsPowerShell\v1.0\;C:\strawberry\c\bin;C:\strawberry\perl\site\bin

2. Use R function

package.skeleton()

to create directory structure and some files (DESCRIPTION and NAMESPACE). I used the following arguments:

package.skeleton(name = "texasranger", list = c("bancaMancaHuman"), namespace = TRUE) # I only have one function, but you can list more

See argument code_files for an alternative way of telling the function where to read your functions. I suspect this may be very handy if you have each function in a separate file.
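
Here is a minimal sketch of the code_files route. The function sayHello, the package name demoPkg and the temporary directories are all made up for this illustration; I also leave out the namespace argument, assuming a reasonably recent R where a NAMESPACE file is written anyway.

```r
# a hypothetical function kept in its own file
src_dir <- tempfile("src")
dir.create(src_dir)
src_file <- file.path(src_dir, "sayHello.R")
writeLines('sayHello <- function(name = "world") paste("Hello,", name)', src_file)

# build the skeleton in a scratch directory instead of the working directory
out_dir <- tempfile("pkgs")
dir.create(out_dir)
package.skeleton(name = "demoPkg", code_files = src_file, path = out_dir)

# see what was created (DESCRIPTION, NAMESPACE, R/, man/ ...)
list.files(file.path(out_dir, "demoPkg"))
```

The function reads the objects defined in the listed files, so there is no need to have them loaded in your workspace first.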

3. Fill out DESCRIPTION and NAMESPACE (if you decide to have a name space, read more @ WRE document). Pay special attention to export, import, useDynLib… All of the above mentioned documentation will help guide you through the process with minimal effort.
A side (but important) note. You should write your functions without them calling

require()

or

source()

to dig up other functions and packages. Read more about NAMESPACE and how to specify which functions and packages to “export” (or “import”) and how.
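
To give a feel for what goes into NAMESPACE, here is a minimal sketch. The two directives are only an assumption of what a package like texasranger might need (exporting its one function and importing runif from stats); your own list will differ. The file is written to a scratch directory and read back with parseNamespaceFile() to check that R understands it.

```r
# scratch "library" holding a package directory with only a NAMESPACE file
pkg_lib <- tempfile("nsdemo")
pkg_dir <- file.path(pkg_lib, "texasranger")
dir.create(pkg_dir, recursive = TRUE)

# hypothetical directives: export one function, import one from stats
writeLines(c(
  "export(bancaMancaHuman)",
  "importFrom(stats, runif)"
), file.path(pkg_dir, "NAMESPACE"))

# parse the file the way R does when it loads a package
ns <- parseNamespaceFile("texasranger", pkg_lib)
ns$exports
ns$imports
```

With importFrom() in place, the function can call runif() directly and needs neither require() nor source().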

4. Create documentation files. This is said to be the trickiest part. I still don’t have much experience with this so I can’t judge how tricky it can be, but I can tell you that it may be time consuming. Make sure you take time to document your functions well. If you were smart, you wrote all this down while you were writing (or preparing to write) a function and this should be more or less a session of copy-paste. Use

prompt(my_function, filename = "my_function.Rd")

to create template Rd files ready for editing. They are more or less self-explanatory (with plenty of instructions). It helps if you know LaTeX, but it is not necessary. Also, I suspect the function may dump the files into the correct /man directory automatically; if it doesn’t, do give it a hand and move the files there yourself. Perhaps worth mentioning is that if you want to reference functions outside your package, use (notice the optional square brackets [])

\code{\link[package:function]{function}}

e.g.

\code{\link[raster:polygonsToRaster]{polygonsToRaster}}

or

\code{\link[utils:remove.packages]{remove_packages}}

To refer to an “internal” package function (one visible to the user), use

\code{\link{function_name}}
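
Here is a minimal sketch of creating such a template. The body of bancaMancaHuman below is only a placeholder, and I write the file into a scratch man/ directory; in a real package you would point filename at the package’s own man/ directory.

```r
# placeholder standing in for the real function
bancaMancaHuman <- function(x) {
  x
}

# scratch man/ directory for the illustration
man_dir <- file.path(tempdir(), "man")
dir.create(man_dir, showWarnings = FALSE)
rd_file <- file.path(man_dir, "bancaMancaHuman.Rd")

# write the Rd template straight to where it belongs
prompt(bancaMancaHuman, filename = rd_file)

# peek at the first few lines of the template
head(readLines(rd_file), 3)
```

Giving the full path in filename saves you from moving the file by hand afterwards.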

4a. If you have datasets you wish to include in your package (assuming those in library(help = "datasets") are not sufficient), you will need to do two things. First, prepare your object (list, data.frame, matrix…). Save it and prepare documentation. The saved .rda file goes into the data/ directory. The documentation file goes into the same directory (man/) as other .Rd files. If your dataset is not bigger than 1 MB you shouldn’t worry, otherwise consult the Manual on how to prepare a

save(my.dataset, file = "my.dataset.rda") # move to data/ folder
promptData(my.dataset, filename = "my.dataset.rda.Rd") # move to man/ folder and edit

4b. You should also build a vignette, where you can explain at greater length what your package is about and maybe give a detailed (or more detailed) workflow with the accompanying functions. You can use Sweave or knitr, and the folder to place your .Rnw file is vignettes/.

5. To check the documentation for errors, use

R CMD Rd2txt filename.Rd

and/or

R CMD Rdconv -t=html -o=filename.html filename.Rd
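
If you would rather stay inside R, the tools package offers similar checks. This is only a sketch under my own assumptions: hello.Rd is a tiny made-up help file written to a temporary directory just so there is something to check.

```r
# a tiny, made-up Rd file to have something to check
rd_file <- file.path(tempdir(), "hello.Rd")
writeLines(c(
  "\\name{hello}",
  "\\alias{hello}",
  "\\title{Say hello}",
  "\\description{Returns a short greeting.}",
  "\\usage{hello()}",
  "\\value{A character string.}"
), rd_file)

# report problems in the Rd file (syntax errors stop with a message)
problems <- tools::checkRd(rd_file)
print(problems)

# render the help page as plain text, as it would appear in the console
txt_file <- file.path(tempdir(), "hello.txt")
tools::Rd2txt(rd_file, out = txt_file)
readLines(txt_file)
```

Reading the rendered text is a quick way to spot formatting slips before running the full package check.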

6. Next, you should run a check on your package before you build it. You should run it from the directory where the package directory is located. I’ve dumped my package contents to d:/workspace/texasranger/ and executed the commands from d:/workspace/

R CMD check texasranger

If you get any errors, you will be referred to the output file. READ and UNDERSTAND it.

7. Build the package with the command

R CMD build package_name

This will create a file and will add a version (as specified in the DESCRIPTION file, e.g. package_name_1.0-1.tar.gz, see WRE for specifics on package version directives).

package_name is actually the name of the directory (which should be the name of your package as well).

If you use Windows, you can build a .zip file AND install the package (uses install.packages) at the same time. Use command

R CMD INSTALL --build package_name_1.0-1.tar.gz

8. Rejoice.