Using PHP and R to create SVG Graphs for Web

September 18th, 2011

You can use the statistical package R to create sophisticated, data-driven SVG graphics for browser applications, particularly when you control which browser the client uses.

Since R wants to write its graphics to a file on the file system, create a named pipe in PHP, pass the path of that pipe to the R script, and call R from the command line (your install location may vary):

<?php
/**
 * r_example
 *
 * program calls R script, and displays output to web.
 *
 */
//to get graphics output from R, we need to create a temporary named pipe.

//create random temp file name
$pipeFile = "/tmp/R-" . mt_rand() . ".svg";
posix_mkfifo($pipeFile, 0600);

//open the pipe for reading, so R has something to write to
$svgPipe = fopen($pipeFile, 'r+');
//stream_set_blocking($svgPipe, TRUE);

//quote the file name so the shell treats it as a single argument
$output = shell_exec("/usr/bin/R -f svg.R --args " . escapeshellarg($pipeFile));
$xmlTextFromR = "";

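//read the pipe, and add it to a variable
//R has already exited, so everything it wrote is buffered in the pipe;
//in non-blocking mode fgets() returns FALSE once the buffer is empty
stream_set_blocking($svgPipe, FALSE);
while(($input = fgets($svgPipe)) !== FALSE) {
	//compare against FALSE so a blank line does not end the loop early
	$xmlTextFromR = $xmlTextFromR . $input;
}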

//close the pipe, then remove it from the file system
fclose($svgPipe);
if(file_exists($pipeFile)) {
	if(!unlink($pipeFile)) {
		die("unable to remove pipe");
	}
}
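
The pipe handling above can be tried on its own, without R. The following is only a minimal sketch, assuming a POSIX system with PHP’s posix extension enabled; the pipe name and the printf command standing in for the R script are made up for illustration. It creates the pipe, lets a shell command write three lines (one of them blank) into it, and reads them back after the writer has exited.

```php
<?php
// Hypothetical pipe name; any unused path in a writable directory works.
$pipeFile = sys_get_temp_dir() . "/demo-" . mt_rand() . ".fifo";
if (!posix_mkfifo($pipeFile, 0600)) {
	die("unable to create pipe");
}

// Open for reading AND writing so the writer below never blocks on open.
$pipe = fopen($pipeFile, 'r+');

// Stand-in for the R script: a shell command that writes into the pipe.
shell_exec('printf "line one\n\nline three\n" > ' . escapeshellarg($pipeFile));

// The writer has exited, so its output is already buffered in the pipe.
// Non-blocking mode makes fgets() return FALSE once the buffer is empty.
stream_set_blocking($pipe, FALSE);
$text = "";
while (($line = fgets($pipe)) !== FALSE) {
	$text .= $line; // blank lines are kept; only FALSE ends the loop
}

// Close the handle, then remove the pipe from the file system.
fclose($pipe);
unlink($pipeFile);
```

Because shell_exec() returns only after the writer exits, this pattern relies on the whole output fitting in the pipe’s kernel buffer. For large graphs, a regular temporary file may be the safer choice.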

In R, we simply take the name of the pipe and write an SVG graph to it.

# svg.R
# creates a simple graph and exports to the file passed as the first argument
# Author: sgrizzard
###############################################################################
# send errors to stdout (without a connection argument, sink() only ends a diversion)
sink(stdout(), type="message")
args = commandArgs(TRUE) #Read user-specific arguments
args[1]

# open svg library
library(RSvgDevice)
args[1] = gsub('^[[:space:]]+', '', args[1]) #remove space
devSVG(args[1])
cars <- c(1, 3, 6, 4, 9)
plot(cars)
dev.off()

q()

Back in PHP, we now have the XML in a text variable, which we can simply echo to the browser.  Or, we can load that XML into a DOM object, and manipulate it.

//loadXML() is an instance method; calling it statically fails on current PHP versions
$xmlFromR = new DOMDocument();
if (!$xmlFromR->loadXML($xmlTextFromR)) {
	throw new Exception('error loading xml from buffer');
}

//add some text at the top of the svg object to show off
$titleTextNode = $xmlFromR->createElement("text", "some text");
$titleTextNode->setAttribute("style", "font-size:10px");
$titleTextNode->setAttribute("transform", "translate(10, 10)");
$svgParentNode = $xmlFromR->getElementsByTagName("svg")->item(0);

$svgParentNode->appendChild($titleTextNode);

//send svg content header, and output the svg to the browser.
header('Content-Type: image/svg+xml;charset=UTF-8');
echo $xmlFromR->saveXML();

?>
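
The DOM step can also be exercised without R. This is a minimal sketch, assuming a tiny hand-written SVG string in place of the output from R; addSvgLabel() is a hypothetical helper, not part of the original script. It creates the new element with createElementNS(), which keeps the label in the same SVG namespace as the rest of the graph.

```php
<?php
// Hypothetical stand-in for the XML text read from the pipe.
$svgText = '<?xml version="1.0" encoding="UTF-8"?>'
	. '<svg xmlns="http://www.w3.org/2000/svg" width="200" height="100">'
	. '<circle cx="50" cy="50" r="5"/>'
	. '</svg>';

/**
 * Adds a <text> label to an SVG document and returns the new XML.
 * (Illustrative helper; the name and parameters are assumptions.)
 */
function addSvgLabel(string $svgText, string $label, int $x, int $y): string
{
	$doc = new DOMDocument();
	if (!$doc->loadXML($svgText)) {
		throw new Exception('error loading xml from buffer');
	}

	// Create the element in the SVG namespace so browsers render it.
	$text = $doc->createElementNS('http://www.w3.org/2000/svg', 'text');
	$text->appendChild($doc->createTextNode($label));
	$text->setAttribute('x', (string) $x);
	$text->setAttribute('y', (string) $y);
	$text->setAttribute('font-size', '10');

	// Append the label to the root <svg> element.
	$doc->documentElement->appendChild($text);
	return $doc->saveXML();
}

$labelled = addSvgLabel($svgText, 'some text', 10, 10);
```

Wrapping the step in a function makes it easy to check the result before sending the image/svg+xml header and echoing the returned string.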

Since we used the RSvgDevice package in the R script, we need to install it.  Start R as the web user (or root), and run:

install.packages("RSvgDevice", repos="http://cran.stat.ucla.edu/")

If all goes well, your web browser should display the plot from R, with “some text” added near the top-left corner.

Categories: Programming

10 IT Projects/Add-ons that are Always worth Doing

July 8th, 2009

If you work with enough IT people, you know that each of us has our “list” of things we are always trying to sell.  Me… I usually push for some form of version control, and everyone else thinks I am crazy.  So, it is prudent to take a look at what your IT consultant is proposing, to see if you really need that IT project, or if the project will cause more long-term support headaches than it is worth.  But, when most IT people are pressed for justification, they respond with a string of “techno-jargon”, and you say, “Okay… do it,” just to make them stop.

However, there are some IT projects that are almost always worth doing.  I have made a list of the top ten, and I will attempt to explain the techno-speech attached to each justification.

  1. Network Documentation – This is absolutely essential; without documentation, you are at the mercy of your IT person’s memory.  This documentation should, at a minimum, include a list of the servers, firewalls, and routers in your office, a list of the services they offer, and a print-out of the configurations of each service.  It should also include a logical schematic of the network, which is a drawing that shows how each computer is connected to the others, and where the routers and firewalls are.  Most importantly, it should include a “one-page” cover sheet with a list of “current issues” and an overview of the network that someone can read and “jump right in”.
  2. IT Policies and Procedures – Even if you are a “one person office” with no servers and one desktop, you should sit down with an IT consultant and discuss what you should and should not do on the system, and write it down.  Topics covered should include what types of information (especially confidential information) should be sent through email, how the computer should be updated and how often, what types of software you should allow users (including you) to install on the machines, what anti-malware protection you are using and how often you should update it, and what the backup and restore procedures are for your data.  You should also go over the common ways of “getting infected with malware”, and establish policies to mitigate these.  A larger company should include hiring and “un-hiring” procedures with regard to accounts and passwords, and have documents for new hires to sign.
  3. Mirrored RAID – The idea is to have multiple hard drives in a system, and let them “copy” each other, so if one hard drive fails, the system keeps running and no data is lost.  The two mechanical parts in most computers, the fans and the hard drives, are the two “physical” devices that tend to fail, and while your computer can usually handle a failed fan intelligently (it usually shuts itself off when the processor overheats), a failed hard drive will put your system out of commission and destroy some of your data.  By installing multiple hard drives, and having them “mirror” each other, you move a problem from “two-week disaster” to “nuisance”.  The details of how you do it (hardware vs software) are less important; any mirrored RAID system is better than no mirrored RAID system.  I recommend installing mirrored drives on every server, and on any computer that “is essential” to your business.  Most server software supports RAID out of the box, but setting up clients with RAID is something you need to ask for when purchasing new computers, and it can be a pain to do.  Note: RAID is not the same as back-up – you need to do both.  RAID protects you against physical hard drive failure, not against other forms of disaster such as malware or office fires.
  4. Off-site Backups – What is the cost of losing all of your data in a fire?  That is the reason to do off-site backups.  These can be as simple as burning a DVD of your data every week and taking it home with you (or somewhere else, if you work from home), or as complex as an automated, hourly backup to a remote server.  If all of your information is on one server, the easiest thing to do is have two external hard drives, mark one “Tuesday” and the other “Friday”, and alternate between them, keeping the one not in use off-site.
  5. WPA2 Wireless Security – Everyone knows to use some form of security on their wireless network, but if you are using the old WEP encryption (likely if your router is over five years old), you might as well not be using any – WEP is too easy to break.  I recommend you forgo ten lattes and purchase a new router.  If the older wireless network cards in your computers do not support WPA2, you can use WPA instead… it’s not as good, but it should be sufficient.
  6. Uninterruptible Power Supplies – These are the big batteries you plug your computer into so, if the power goes out, your computers stay on.  The most important feature of these is that they keep the power constant to the computers if the power “flickers” in the building.  That “flicker” ruins computers.  Buy a UPS for every computer in the office, and buy a “heavy-duty” one for the servers.  There is no reason to lose a $1500 piece of hardware for want of a $50 battery.  Indeed, the power just flickered while I was writing this, and my computer, on a battery backup, stayed on.
  7. Client Certificates or Keys for Remote Access – If you can access your office computer from home with only your password, you need better protection.  If that password gets out (and it will), you have compromised all of the data you can access from that machine, which is much more than you realize.  Think of a remote client certificate or key as a five-hundred character password that you don’t need to remember, and that is unique to a particular computer.  If the “password” is ever lost or stolen, it is easy to revoke only that certificate or key, while keeping the certificates and keys you have installed on other machines.
  8. Defense in Depth – This refers to having more than one line of “sufficient security” between you and the Internet.  There should be two firewalls between you and the Internet: the firewall inside your computer, and the firewall between your computer and your Internet connection (usually a correctly-configured router).  There should also be two levels of anti-malware detection on your email, the first on your email server, and the second on your desktop; if your email provider does not provide virus detection, switch.  The reason for doing this is that there are often undiscovered vulnerabilities in any single solution, and having two layers of protection protects you from these vulnerabilities.
  9. TLS/SSL for Email and Web Applications – You should encrypt the connection between you and the servers you use, even if they are inside your internal network.  Many connections you use without thinking about it (such as connecting to a Windows server for files or Remote Desktop) are usually encrypted by default, and the three that are most often not encrypted are your email, your web browsing, and any instant messaging client you use.  You want to encrypt these connections, particularly when you send your username and password to access the services; you might not care about someone intercepting your Facebook profile edits, but you should care about someone intercepting that Facebook password, because users reuse passwords, and that “Unger50Todd%” sent over the Internet in plain text might also be the very secure password for the Windows Domain that allows Remote Desktop (another good reason to use certificates).  For email, ask your email provider to help you enable encryption, and if they don’t offer it, switch email providers.  For web sites, try putting https:// instead of http:// when connecting to web servers before you send your password.  If the web site does not offer encryption, and it is not essential to your business, use a different password for that service, and write it down somewhere.  You are sending it in plain text across the network anyway, so you shouldn’t worry too much about it.  If the site is essential to your business (such as a supplier), talk to the webmaster; there is probably a way to connect securely that is simply not “obvious”.
  10. “Browser Based” Internal Applications – There is a whole class of small to medium size business applications that you can install on an “internal server” and access through a web browser.  Given two “all else equal” alternatives between an application that you install on local computers, and an application you must set up a “Web Server” inside your office to run, go with the server-based application.  The reason is simple: the application is (or should be) client-neutral, so you can connect to it from any computer that runs the required web browser.  More importantly, you only need to worry about updating and protecting the application in one location instead of multiple locations, which makes the application much cheaper to manage.  Furthermore, if you must develop an internal application for your business, insist that it be developed as a browser-based application.  If the developers put up a fuss about having to run a web server, say “that’s fine, you can pick one up for $400”.  If they still refuse, find someone else.  Given the speed of connecting to a server over an internal network, and the development of Ajax applications that seem “desktop-like”, there are very few reasons to build custom internal applications as desktop programs.  Custom web-based applications are much less expensive to develop, and inherently more secure because the application connects to the central database directly instead of over the network (and I have yet to see an internal application that was not a database application).

If you have any differences of opinion, I would like to hear about them.

Categories: Choosing Software

Is Open Source Right for My Small Business?

July 7th, 2009

“Is Open Source Software right for my small business?”

If you are asking the question, the answer is yes!

In fact, you are using it right now – this web server runs Linux (operating system), Apache (web server), and WordPress (blogging framework).  All of these programs are Open Source.  For something closer to your small business, if you have a web host, and you don’t know what software they are running, it is 99% Linux and Apache (you have to pay “extra” for a Windows host).  If they deal with your email, and you did not pay “extra” for Exchange, then that server is running open source software.  Your ISP is running open source software, and if you have a Linksys router, you are running open source software.

The question really is, “Should I use more Open Source software than I do now?”  To answer this question, you need to understand what “Open Source” is, and more importantly, what Open Source is not.

Isn’t Open Source just Free Software?

No!… if by “free” you mean “costs nothing”.  Open Source advocates like to distinguish between “free as in speech” and “free as in beer”, but at this point, the small business owner does not care.  At this point, you mean price.  (Freedom with software is important, but I will talk about it later.)

First, some “Open Source” software costs money out of the box… usually coupled with hardware.  Your Linksys router is a good example of this, and the Endian Firewall appliance is another.  These companies release the source code for their product, knowing that the value added through the appliance is sufficient for you to purchase it.

Secondly, you need to consider all of the costs associated with a particular piece of software, not just the “shrink-wrap price”.  Microsoft did a large marketing campaign a few years ago about “total cost of ownership”, claiming Windows cost less to own and operate than Linux when you added in the “hidden” costs of installation and support.  For large businesses and data centers, Microsoft’s claims are a load of… garbage.  At scales beyond twenty servers, Linux becomes much cheaper since Microsoft’s per unit costs stay about the same as you add units, while Linux per unit costs shrink to the cost of hardware alone.  That’s why Google, and your ISP, use Linux.  However, at smaller scales such as your small business, total cost of ownership becomes a serious factor to consider, not just for Linux, but for all Open Source software.

Isn’t Open Source just a Hobby for Uber-Geeks?

No.  Some Open Source software starts that way: a geek wants to solve a particular problem and does so, and then puts the solution on his or her website for everyone to use.  A good example of this kind of software is the software on my personal website, where I have created scripts and add-ons that solve a particular problem I have, and then shared the results.  Other Open Source software starts as a commercial enterprise, and the vendor builds a business model around offering support and other services for the software.  MySQL, the most widely used relational database system, is an example of this type of software.  Most of the software you, as a small business owner, are concerned with is a hybrid of both – a “grand commercial design” that has been “filled out” by individual users adding certain features.  OpenOffice.org is an example of this sort.  Of course, there’s also Linux, which is all of the above.  It started as a “geeky project”, but quickly expanded into several “commercial flavors” that are then extended by non-commercial developers, and so forth.

The point is that many pieces of Open Source software are meant for end-users that have varying degrees of technical skills, and are developed (and supported) with varying degrees of profit motive.

Isn’t Open Source just Linux?

No.  Linux is only one example of Open Source software; there are also Open Source office suites (OpenOffice.org), Open Source image editors (GIMP), Open Source email clients (Thunderbird), Open Source web browsers (Firefox), etc.

In fact, there is very good Open Source software that only runs on Windows (which I have ranted about here and here).

Then, What is Open Source Software?

It depends who you ask, but ultimately it is a software license, stating, among other things:

  1. you have access to the non-obscured, original set of human readable instructions that create the software
  2. you may run the program for any purpose
  3. you may modify the program as you wish
  4. you may share your modifications, and the original, if you choose to, with anyone for any purpose, under the same license

For small businesses that aren’t in the development business, Open Source is essentially a feature, but an important one.  It means, among other things,

  • You do not need to purchase the software for every computer you use it on
    • You do not need to maintain records for every computer you use to ensure you comply with the license
  • You can customize (or pay someone else to customize) the software to fit your particular need, if the software doesn’t quite fit.
  • You do not have to rely on a particular vendor to maintain the software
    • A vendor cannot charge huge, ongoing fees for “maintenance contracts”
    • If the vendor goes out of business, you are not “hung out to dry”, and can still install the software on new computers, and “update” the software to work on new platforms.

Like any other feature, you should decide what Open Source is worth to you as a feature, for each piece of software you use, add it to your value of the other features, and compare it to the alternatives.  Then, you should compare the total cost of using the Open Source software and the alternatives.

Total Cost of Ownership of Technology Solutions for Small Business

While obtaining Open Source software is usually the cost of downloading it, that is not the only cost of using a piece of software.  The primary costs of Open Source solutions are usually the following:

  1. Information about the product (does it exist, should I use it?)
  2. Software Compatibility
  3. Installation
  4. Support
  5. Training (including “figuring it out” time)

Information about the Product

Open Source demands greater search costs, starting with the search cost of discovering the existence of the software, and then the cost of deciding whether or not it is right for you.  Unlike proprietary vendors, open source solutions usually do not have large marketing teams to throw the software in front of you and tell you it is the greatest thing ever… you have to head to the Internet, search for the software, read reviews, and ultimately download and play with it.  While diligent users do this for all software they use, most of us just use what is put in front of us when it comes to software, and what is in front of you is likely not Open Source.

An IT consultant can help you with this.  If you already have one, ask him if there are any Open Source alternatives to the software you are buying.  Most of us are into “playing with software”, and some of the software you are using may already be Open Source.  If you don’t have an IT consultant, and you are a “one-person” shop, ask a client or search for one in your area.  The other alternative is to search Google for NAME OF SOFTWARE YOU USE and “Open Source”.  For example, “open source ms project” takes you to Open Workbench’s website.

Software Compatibility

This is the largest barrier for Open Source software… whether or not it works with “everyone else’s” software.  This is most evident when considering OpenOffice.org to replace Microsoft Office – everyone that you share documents with is likely using Microsoft Office, and while OpenOffice.org is 99% compatible with the older version of MS Office, and 97% compatible with the new version, it is still not perfect, especially with spreadsheets and databases.  However, OpenOffice.org is 99.99% compatible with Google Docs, so the problem is definitely shrinking.

That said, not all of the programs written for Microsoft Office will work with OpenOffice, so if you have a specialty program that works with Microsoft Office, you need to consider finding an alternative to that program a cost.

Installation

Open Source software is sometimes harder to install than proprietary software, simply because it has so many more options than proprietary software.  When it comes to desktop software, just accept the defaults and don’t worry about it.  For server software, you will need a professional to install it for you.  However, you should have a professional helping you install Windows servers too (though that professional will cost you less for Windows).

Support

Support is sometimes more expensive for Open Source software, simply because fewer people use it to do the common tasks, and those that do use it know the “tricks” for free support.  Those that can support the software usually have a “higher level of knowledge” than people that only support the common, proprietary options, so they cost more.  As the number of items you use goes up, the “higher level of support” cost goes down per-unit, while the licensing remains the same.  Commercial Open Source is the exception, where the primary revenue stream is support services.  Here, product support is comparable to proprietary products, but you usually pay for it “per incident” instead of it being included (sometimes) in the purchase price of the product.  There are also several companies that provide support for Open Source software, such as Open Logic.

However, support for “non-training issues” is needed much less frequently with Open Source software than it is for proprietary software, since the “technical quality” tends to be higher in Open Source software.  Open Source software also does not suffer from the “never-ending cascade of malware” that users of Microsoft Windows and Microsoft Office are so intimate with, so you can take those costs off the top as well.

Training (including “figuring it out” time)

It is a common misconception that Open Source software is more difficult to use than proprietary software.  Some of it is, but most of it is just different, especially on the desktop.

OpenOffice.org has a different interface than Microsoft Office, but frankly I find the ribbon in the new version of Microsoft Office a pain in the neck to use.  Firefox is just as easy to use as Internet Explorer, but the “advanced features” are configured in a different location, and some features need to be installed via “add-on” instead of simply enabled (but there are many more features available).  Comparing Thunderbird to Microsoft Outlook raises the same issue: if you are used to Outlook, Thunderbird is very hard to use.  However, I use Thunderbird, and using Outlook is disconcerting to me.

If you are being “forced to upgrade” to something new, that is the time to try Open Source software to minimize the training cost.  The perfect time to give OpenOffice.org a spin is when you are looking to move to Office 2007, and the best time to try a Linux Desktop is when you are looking with dread at Vista.

Where to go from here…

If you want to give Open Source a try without breaking what works for you now, try Firefox.  It is a very slick Open Source web browser.  For something more meaningful, try OpenOffice.org for a week without getting rid of your current office solution.  Even if you need some compatibility feature, and have to stick with Microsoft Office for that reason, you will have a “feel” for what most Open Source software is like to work with – better in many respects, worse in others, but above all, a little different.

If you are thinking about alternatives, for servers or “total solutions”, you need to talk to an IT professional.  Find one that leans towards Linux but knows Windows, and is not religious about it, and one that will honestly sit down with you and evaluate your business.  For smaller organizations with only one “general purpose” server, a Windows or Mac server is usually a good choice.  However, if you need a server for just one thing (like file sharing), or you have a specialist need (such as version control), Linux servers will likely work better for less money.

When it comes to Open Source software, the Google search is your best friend.  Use it to find alternatives to expensive software solutions others propose, and use it to see if someone has created the software you need to fix “that one little problem”.  The initial returns on a little bit of searching are very high, no matter what solution you go with.

Categories: Choosing Software