March 07, 2012

CDR Processing - with Flume/HDFS

This post continues the earlier posts on CDR processing.

http://ssklogs.blogspot.in/2011/11/hadoop-and-cdr-processing.html

http://ssklogs.blogspot.in/2012/01/hadoophbase-based-cdr-processing.html


So far we have seen how to process CDRs using Hadoop/HBase. What about receiving the CDRs from the various network elements (NEs)? There is no standard CDR format or delivery mechanism; each vendor follows its own.

To collect these CDRs, we need to follow the same big data approach. That's where Flume comes into the picture.

Using Flume, we can set up one or more nodes to collect the CDRs from one or more NEs, perform any transformations (changing the format, adding/removing fields etc.) and store the result in HDFS for further processing.

Flume is built around the Source-Decorator-Sink concept. The Source is where the data enters: here the NE FTPs its files to the designated agent node's file system, and the source reads them from there. The Decorator performs any transformations. The Sink writes to HDFS. Flume comes with a good catalog of sources, sinks and decorators, so we only need to write a decorator plugin for the custom transformation.

Let's take the CDR format used in the previous posts. Assume we need to prepend the TNs with the country code before sending them to Hadoop for processing.

Here are the steps to use Flume for CDR Processing.

- Install Flume 0.9.4 binaries
- Download Flume 0.9.4 Source (to develop custom decorator plugins)
- Copy the hadoop core jar from the Hadoop installation to the Flume lib directory (and remove the one bundled with Flume)
- For this exercise, the Flume master and node run on the same machine as the NameNode, but they can be distributed across different machines if needed
- Write the custom decorator plugin using the helloworld sample provided. In the append() method, get the event body and do the required transformations.

@Override
  public void append(Event e) throws IOException, InterruptedException {
    String rec = new String(e.getBody());
    System.out.println("cdrDeco -> " + rec);

    // returnDelims=true, so the "|" separators come back as tokens too
    StringTokenizer st = new StringTokenizer(rec, "|", true);
    StringBuilder newRec = new StringBuilder();

    // Copy the first three fields and their separators unchanged:
    // fromTN, "|", field 2, "|", field 3, "|"
    for (int i = 0; i < 6; i++) {
      newRec.append(st.nextToken());
    }

    // Prepend the country code to the fourth field (toTN)
    String toTN = st.nextToken();
    newRec.append("+91").append(toTN);

    // Keep any fields that follow toTN instead of dropping them
    while (st.hasMoreTokens()) {
      newRec.append(st.nextToken());
    }

    EventImpl e2 = new EventImpl(newRec.toString().getBytes(),
        e.getTimestamp(), e.getPriority(), e.getNanos(), e.getHost(),
        e.getAttrs());

    super.append(e2);
  }
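
The transformation itself can be tried outside Flume before packaging the plugin. The following is a minimal sketch, not the plugin code: the class name CdrCountryCode, the sample record and the assumption that the to-TN is the fourth pipe-delimited field are all illustrative. It uses String.split with a limit of -1 so that empty fields keep their position.

```java
import java.util.regex.Pattern;

public class CdrCountryCode {
    // Assumption: pipe-delimited record with the called number (to-TN) in the 4th field.
    static final int TO_TN_INDEX = 3;
    static final String COUNTRY_CODE = "+91";

    static String prependCountryCode(String record) {
        // A limit of -1 keeps empty and trailing fields, so field positions stay stable.
        String[] fields = record.split(Pattern.quote("|"), -1);
        if (fields.length <= TO_TN_INDEX) {
            return record; // too few fields: pass the record through unchanged
        }
        fields[TO_TN_INDEX] = COUNTRY_CODE + fields[TO_TN_INDEX];
        return String.join("|", fields);
    }

    public static void main(String[] args) {
        // Hypothetical sample record, not taken from a real switch.
        System.out.println(prependCountryCode("9840012345|20120307|101500|9840054321|60"));
    }
}
```

Unlike StringTokenizer, this approach does not shift field positions when a record contains an empty field (two consecutive "|" characters).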



- Build the plugin jar using 'mvn package'
- Add the jar to FLUME_CLASSPATH (in the bin/flume script)
- Add the plugin to conf/flume-site.xml

  <property>
    <name>flume.plugin.classes</name>
    <value>cdrDeco.CDRDeco</value>
    <description>Comma separated list of plugin classes</description>
  </property>
 


- Start the master (bin/flume master)
- Access flume web console (flumeMaster:35871) and make sure the plugin shows up under "extn" tab

- Set the node's source-decorator-sink configuration as below using the "config" tab on the master console

nodeHostName: tailDir( "/tmp/switch1" ) | { cdrDeco => escapedFormatDfs( "hdfs://nameNode:9000/user/cdr/input/switch1", "CDRRecs", raw() ) }

- Start the Node (bin/flume node)


- The node now watches the source directory, applies the cdrDeco transformation to each record and writes the result to the HDFS directory

February 28, 2012

Tiggzi / 5-min mobile apps

Tiggzi, the cloud-based mobile app development IDE, amazed me with its simplicity, ease of use and innovation. It's so intuitive that I didn't have to refer to the manual, quick start guides or sample "Hello World" examples. The apps can be built using HTML5, JS, jQuery, REST etc.

I change my wallpaper often, mostly to nature/landscape images, and wanted an easy way to change it on demand. I used Tiggzi to create an app that shows me a random image for a tag from Picasa, so that I can set it as my wallpaper.

It really took just 5 minutes. Of course this doesn't include the time it took to analyze how the Picasa REST API works, the input/output etc. Once that info is there, it's all just wiring, which reminded me of the good old days of Visual Basic. The best thing is that Tiggzi comes with a REST service builder, with which one can just map the request/response of the service to the UI components. No coding required.

For example, for this app, all I had to do was

- Provide the picasa REST service URL
- Add request param details.
- Test using sample data. Use the output as the service's response format
- Build the UI - Drag and Drop Image, Button and TextInput
- Use the Data mapping tool to map the TextInput to the REST input param (query field)
- Use the Data mapping tool to map the Response XML field (image URL) to the Image component 'src' property
- Set the action for Button click to invoke the above service

That's it. It worked! Another great thing is that one can export this HTML5 app as an Android app too, either as an apk binary or in source format if customization is required.

Following are some screenshots of the app running on my mobile as an HTML5 app as well as an Android app.

Running as HTML5 app inside a browser

Running as an android app



February 27, 2012

JBoss EAP 4.3 / "setProperty must be overridden ..." error

The error "setProperty must be overridden by all subclasses of SOAPMessage" occurs when the SOAPMessage class is loaded from the JDK runtime jar instead of the JBoss one, jboss-saaj.jar. There are several workarounds suggested; the most prominent one is to move the web service related jars to lib/endorsed of JBoss.
For some reason, this workaround didn't work for me. I verified using the -verbose option and the class was always being loaded from the JDK's rt.jar.

I had to prepend jboss-saaj.jar to the boot classpath of the JDK using the "-Xbootclasspath/p:" option in run.bat. This fixed the issue for me.
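
Besides -verbose, the origin of a class can also be checked from code. The following is a small sketch using only standard java.lang APIs; the class name ClassOrigin is made up for illustration. It asks a class for the URL of its own .class file, which also works for classes loaded from the boot classpath. To check the SAAJ case, it would have to run inside the application server with javax.xml.soap.SOAPMessage as the argument; java.lang.String is used as the default here only so that the sketch runs anywhere.

```java
import java.net.URL;

public class ClassOrigin {
    // Returns the URL of the .class file as seen by the class's own loader.
    // For boot classpath classes this points into the JDK runtime
    // (rt.jar on older JDKs, a jrt: URL on newer ones).
    static URL origin(Class<?> c) {
        return c.getResource("/" + c.getName().replace('.', '/') + ".class");
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Assumption: the class name to inspect is passed as the first argument.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " <- " + origin(Class.forName(name)));
    }
}
```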

Jboss EAP 4.3, JDK 1.6

February 24, 2012

OpenSSL memory tuning

We have a TCP server handling a large number of clients over SSL. Recently we found a valuable option that reduces the memory footprint of SSL connections by releasing the per-connection read and write buffers when they are no longer needed.

http://www.openssl.org/docs/ssl/SSL_CTX_set_mode.html (SSL_MODE_RELEASE_BUFFERS)

February 22, 2012

IaaS Cloud Standards / deltaCloud

We have so many IaaS vendors, but the pain point is that there is currently no standard management interface to manage the instances, images etc. Each vendor has its own API.

DMTF has a cloud working group which is working on a standard called CIMI; it is in the draft phase. Once it comes out and the member companies start adopting it, much nicer converged management tools can be targeted at enterprise customers. The IaaS vendors can continue providing their native APIs with unique features along with CIMI, the same way DB vendors provide JDBC alongside native APIs like OCI.

Until then, we have the Apache deltaCloud project, which provides a native/CIMI based API gateway to communicate with 10+ IaaS vendors.

Following are the steps to install and use.

- install ruby 1.9.1
- install rubydev 1.9.1
- install rubygems

- Make sure libxml, libxslt and corresponding dev packages are there

- Using ruby gem command, install the following
    ruby-client
    sinatra (needed for CIMI web gui client)
    sinatra-content-for (needed for CIMI web gui client)
    deltacloud-core

- To use the CIMI web GUI client, we need the deltacloud source; get it from the git repo. The source package contains the CIMI web client. This is not required if a custom client or curl is going to be used.

- Now to use it with EC2, we need the API key ID and corresponding secret key.


- Start the deltaCloud gateway with CIMI as NorthBound API and EC2 as SouthBound API.

   deltacloudd --cimi -i ec2 

Now we can use the CIMI REST/JSON interface to manage the EC2 account.

For example, to list all instances using CIMI
     curl --user "EC2KeyID:EC2SecretKey" "http://localhost:3001/cimi/machines?format=json"

and the output will be like
{"uri":"http://localhost:3001/cimi/machines","name":"default","description":"Ec2 MachineCollection","created":"2012-02-22 01:24:45 +0530","machines":[{"href":"http://localhost:3001/cimi/machines/instanceID"}]}


To get details about an instance,
    curl --user "EC2KeyID:EC2SecretKey" "http://localhost:3001/cimi/machines/instanceID?format=json"
The output contains details on the EC2 Account Number, VM details, Image details, Volumes etc
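
The same calls can be made from Java instead of curl. The following is a rough sketch using only the JDK (HttpURLConnection and java.util.Base64); the class name CimiMachines is made up, and it assumes the gateway accepts HTTP Basic authentication with the EC2 key ID and secret key, which is what curl's --user option sends. The URL is the one from the curl example above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class CimiMachines {
    // Builds the HTTP Basic Authorization header value from the EC2 credentials.
    static String basicAuth(String keyId, String secretKey) {
        String pair = keyId + ":" + secretKey;
        return "Basic " + Base64.getEncoder()
            .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    // Performs a GET against the gateway and returns the response body as text.
    static String get(String url, String keyId, String secretKey) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestProperty("Authorization", basicAuth(keyId, secretKey));
        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
        } finally {
            conn.disconnect();
        }
        return body.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java CimiMachines <EC2KeyID> <EC2SecretKey>");
            return;
        }
        // Lists all machines; requires the gateway to be running locally.
        System.out.println(
            get("http://localhost:3001/cimi/machines?format=json", args[0], args[1]));
    }
}
```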

CIMI covers the management part of IaaS. What about a standard for the runtime VM image itself and its portability? Since virtual appliances are gaining momentum, a standard way of packaging and distributing them is a must. Here too there are a couple of standards:

http://www.dmtf.org/standards/ovf
http://tosca-open.org/

February 20, 2012

Jboss Tools 3.x / Eclipse Indigo / Target Runtime

When using JBoss Tools to create J2EE projects, if the JBoss runtimes (community or enterprise) don't appear in the Eclipse (Indigo) runtime drop-down list, check that the JDK Eclipse is running on is at least 1.6. When running on JDK 1.5, JBoss Tools (3.x+) doesn't seem to load. I changed it to JDK 1.6 and am able to see the runtimes now.

February 13, 2012

Oracle Weblogic Active GridLink

This post continues the earlier RAC post. Oracle WebLogic has added a new way to access RAC, called Active GridLink, by integrating the ONS features into the appserver layer. This way the appserver datasources can be more intelligent by knowing the following

- Real time load of the RAC instances (the MDS approach uses round robin)
- DB instance status for FCF

The main advantage is that one doesn't have to go with a fixed scheme of choosing either actual load based balancing (SSLB) or pinning instances (at the appserver). Instead the appserver can now decide dynamically on the best way to serve the connection request, based on the ONS feedback and current global transaction needs.

http://www.youtube.com/watch?v=8D6cf6Y5z94 (Demo)

February 07, 2012

JBoss EPP 5.2 / "Error installing to Parse" errors

If an out-of-the-box EPP 5.2 throws several errors at startup, like ClassNotFound, "Error installing to Parse: name=vfsfile jboss-service.xml" etc., the fix is to unzip the install package properly.

In my case, the server/default/lib directory was missing. Starting from 5.x, these libs have been moved to the common dir, but the empty directory is still required, and for some reason WinZip didn't create it from the package. WinRAR did.


February 02, 2012

Android / Bluetooth / Hotspot / File Transfer

I couldn't make the MID apad use my 3G USB modem. I finally found a way to use 3G through my Nokia X6. There is an excellent app called JoikuSpot available for Nokia to run a WiFi hotspot. It works great and my laptops were able to connect and use the hotspot.

But unfortunately the apad couldn't recognize the ad hoc hotspot, and further googling revealed that I need to root my apad and enable ad hoc WiFi connections. Need to try it.

A few days back I got a Samsung Galaxy S with Android 2.3.5. To my surprise (shock!) I found that Bluetooth access to the phone from a PC (using the Samsung Kies software) is not supported! I couldn't even send photos to the PC over Bluetooth (this is where I like Nokia and their PC Suite software!). I finally found a nice app on the Android Market to run a WiFi based FTP server. Now I am able to transfer files from the phone to the PC.

January 28, 2012

Oracle RAC 10g / JDBC / HA / Load Balancing

Using Oracle RAC with JDBC to enable HA/load balancing of database nodes is tricky. Following are the different ways I have configured JDBC URLs for use with a RAC based database.


 1) Pinned DB Instance/Listener with failover / No Server side Instance Load Balancing

   In this approach, the appserver nodes are pinned to a primary DB node. For example, if you have two appservers (app1/app2) and two DB nodes (db1/db2), then pin app1 to db1 as its primary DB node with db2 as its secondary DB node (for failover), and vice versa for app2.

On app2, the JDBC URL may look like

  jdbc:oracle:thin:@(DESCRIPTION=(LOAD_BALANCE=off)(FAILOVER=ON)(ADDRESS=(PROTOCOL=TCP)(HOST=db02-vip)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=db01-vip)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=dbServiceName)))


Pros
    - Interconnect traffic is minimized since global cache data doesn't need to be frequently exchanged between DB nodes
    - 2PC can be avoided, thus eliminating the dreaded "in-doubt distributed transaction" exception (ORA-01591)

Cons
    - DB node load may not be even if the appserver load is uneven
    - Relatively longer failover time, since all connections in the appserver connection pool point to the primary node and have to be re-established against the surviving node

2) Pinned Listener with failover / Server side Instance Load Balancing


   In this approach, the appserver nodes are pinned to a primary listener (not to the instance as above). Server side load balancing must be enabled using the REMOTE_LISTENER and Load Balancing Advisory (LBA) options. In this method, any listener can be used to reach any of the DB instances (SIDs), and the decision is based on the load balancing advisory computed at the server side. There is no change in the client URL.

  jdbc:oracle:thin:@(DESCRIPTION=(LOAD_BALANCE=off)(FAILOVER=ON)(ADDRESS=(PROTOCOL=TCP)(HOST=db02-vip)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=db01-vip)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=dbServiceName)))


Pros
    - DB node load balancing is relatively more accurate since the LBA looks at the actual load and routes the connection request at the server side
    - Relatively shorter failover time for the appserver connection pool since its connections are spread over multiple instances

Cons
    - More interconnect traffic
    - 2PC and "in-doubt transaction" exceptions (ORA-01591) can occur since the appserver connection pool can hold connections pointing to multiple instances
    - Listener load may not be even if the appserver load is uneven

3) Driver based Listener Load Balancing  / Server side Instance Load Balancing


   This approach is similar to the previous one except the listener traffic is also load balanced (round robin) by the JDBC driver.
  
  jdbc:oracle:thin:@(DESCRIPTION=(LOAD_BALANCE=ON)(FAILOVER=ON)(ADDRESS=(PROTOCOL=TCP)(HOST=db02-vip)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=db01-vip)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=dbServiceName)))


4) Using Oracle SCAN (From 11g onwards)

   This approach is similar to the above, but the pain of configuring individual nodes and their VIPs in the JDBC URLs is eliminated. All we need is round robin enabled DNS name resolution for the SCAN name, which is then used in the JDBC URLs.

  jdbc:oracle:thin:@dbScanName:1521/dbServiceName
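
Since the first three URL forms differ only in the LOAD_BALANCE flag and the order of the ADDRESS entries, they can be generated rather than hand-edited per appserver. The following is a minimal sketch in plain Java string handling; the class and method names are made up for illustration, and the host, port and service names are the placeholders used in the URLs above. It does not need the Oracle driver to run.

```java
import java.util.Arrays;
import java.util.List;

public class RacJdbcUrl {
    // Builds a thin-driver URL with one ADDRESS entry per host, in the given order.
    // With LOAD_BALANCE off, the first host acts as the primary (pinned) one.
    static String racUrl(boolean loadBalance, List<String> hosts, int port, String serviceName) {
        StringBuilder sb = new StringBuilder("jdbc:oracle:thin:@(DESCRIPTION=");
        sb.append("(LOAD_BALANCE=").append(loadBalance ? "ON" : "off").append(")");
        sb.append("(FAILOVER=ON)");
        for (String host : hosts) {
            sb.append("(ADDRESS=(PROTOCOL=TCP)(HOST=").append(host)
              .append(")(PORT=").append(port).append("))");
        }
        sb.append("(CONNECT_DATA=(SERVICE_NAME=").append(serviceName).append(")))");
        return sb.toString();
    }

    // Builds the short SCAN based URL used from 11g onwards.
    static String scanUrl(String scanName, int port, String serviceName) {
        return "jdbc:oracle:thin:@" + scanName + ":" + port + "/" + serviceName;
    }

    public static void main(String[] args) {
        // Approaches 1 and 2, as seen from app2: db02-vip first, db01-vip for failover.
        System.out.println(racUrl(false, Arrays.asList("db02-vip", "db01-vip"), 1521, "dbServiceName"));
        // Approach 3: the driver load balances across the listeners.
        System.out.println(racUrl(true, Arrays.asList("db02-vip", "db01-vip"), 1521, "dbServiceName"));
        // Approach 4: SCAN.
        System.out.println(scanUrl("dbScanName", 1521, "dbServiceName"));
    }
}
```

For app1 in the pinned approaches, the host list would simply be reversed so that db01-vip comes first.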


Note: Apart from the above methods, which are mostly generic, each appserver vendor has its own way of using a RAC DB. An example is WebLogic MDS.