woanders ist es auch schoen

datasci-1903 - It's actually quite easy to develop and test the Pig scripts locally; see Ge Peng's instructions in this thread: (Pig evidently includes Hadoop and runs locally out of the box, i.e. there is no need for a VM as long as Java is installed.)

cd
mkdir pig
cd pig
curl -o pig-0.13.0.tar.gz
tar xzvf pig-0.13.0.tar.gz
cd ~/intrDataScience/datasci_course_materials/assignment4
~/pig/pig-0.13.0/bin/pig -x local
grunt> register ./pigtest/myudfs.jar
grunt> raw = load './cse344-test-file' USING TextLoader as (line:chararray);
grunt> ntriples = foreach raw generate FLATTEN(myudfs.RDFSplit3(line)) as (subject:chararray,predicate:chararray,object:chararray);
grunt> describe ntriples
ntriples: {subject: chararray,predicate: chararray,object: chararray}
grunt> line1 = limit ntriples 1;
grunt> dump line1
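
The same grunt statements can also be kept in a script file and run non-interactively, which may be handier for repeated local tests. This is only a rough sketch: the file name test.pig and the PIG_HOME variable are made up for illustration, and it assumes you run it from the assignment4 directory so that the relative paths to the jar and the test file resolve.

```shell
#!/bin/sh
# Sketch: put the grunt statements from the session into a script file
# and run it with Pig in local mode.
# PIG_HOME and the file name test.pig are assumptions for illustration.
PIG_HOME="${PIG_HOME:-$HOME/pig/pig-0.13.0}"

# Quoted heredoc delimiter so the shell does not expand anything inside.
cat > test.pig <<'EOF'
register ./pigtest/myudfs.jar;
raw = load './cse344-test-file' USING TextLoader as (line:chararray);
ntriples = foreach raw generate FLATTEN(myudfs.RDFSplit3(line)) as (subject:chararray,predicate:chararray,object:chararray);
line1 = limit ntriples 1;
dump line1;
EOF

# Run the script only if Pig is actually installed at the assumed location.
if [ -x "$PIG_HOME/bin/pig" ]; then
    "$PIG_HOME/bin/pig" -x local test.pig
else
    echo "pig not found at $PIG_HOME/bin/pig; wrote test.pig only"
fi
```

If Pig lives somewhere else, setting PIG_HOME before calling the script should be enough, e.g. PIG_HOME=/opt/pig sh run_local.sh (both the path and the script name are hypothetical).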