Saturday, February 4, 2017

Simple Spark Streaming with Kafka in Docker

I have tried to use simple way of Data streaming in Spark from Kafka using Docker environment. For this test you would need Docker installed and has access to Internet.

1. First start the docker (I am using boot2docker in Windows environment) as:
$ docker-machine start default
$ docker-machine ssh default or ssh docker@localhost -p 23

2. Inside docker environment load the Spark docker images (for this example I have used sequenceiq/spark)
$ docker pull sequenceiq/spark

3. his will pull the latest docker images to your docker environment. Once pull is complete run the following command in docker
$ docker run -it -p 4040:4040 -p 2181:2181 -p 9092:9092 -v /<some_path_to_share>:/data sequenceiq/spark:1.6.0 /bin/bash

following this command the spark image console is appear.

4. Now download the Apache Kafka from (https://kafka.apache.org/downloads) page. Choose suitable kafka version. For this example purpose, I downloaded kafka_2.10-0.10.1.0.tgz for Scala version 2.10

5. Unzip the file as:
$ tar -xvf kafka_2.10-0.10.1.0.tgz
$ mv kafka_2.10-0.10.1.0 /usr/local/kafka
$ cd /usr/local/kafka/bin

5. Now after download and unpack and move to suitable folder, start zookeeper and kafka from bin folder as below:
$ ./zookeeper-server-start.sh /usr/local/kafka/config/zookeeper.properties
$ ./kafka-server-start.sh /usr/local/kafka/config/server.properties

6. After the zookeeper and kafka sucessfully started, create the topic for streaming as:
$ ./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic spark-topic
- this will create topic

7. Next thing is to create the sample Spark application as:
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.kafka.KafkaUtils
import org.apache.spark.streaming.{Seconds, StreamingContext}

object KafkaReceiver {

  def main(args: Array[String]): Unit = {

    val conf = new SparkConf().setMaster("local[*]").setAppName("Kafka Receiver")

    val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")

    val topics = List("spark-topic").toSet

    val ssc = new StreamingContext(conf, Seconds(5))

    val lines = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topics).map(_._2)

    println("printing Streaming data.......")

    lines.print()

    ssc.start()
    ssc.awaitTermination()
  }
}

8. Package this with required libraries as in build.sbt below

name := "SparkExample"

version := "1.0"

val sparkVersion = "1.6.0"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion,
  "org.apache.spark" %% "spark-streaming" % sparkVersion,
  "org.apache.spark" %% "spark-streaming-kafka" % sparkVersion
)


9. Run the jar after packaging from spark-submit as:
$ spark-submit --master local[*] --class KafkaReceiver --packages org.apache.spark:spark-streaming-kafka_2.10:1.6.0 /sparkexample_2.10-1.0.jar

once it start running check the console

10. Finally run the kafka producer to produce the data for streaming as:
$ ./kafka-console-producer.sh --broker-list localhost:9092 --topic spark-topic
Hello Spark Streaming

In Spark console you will see something like:

17/02/04 21:55:50 INFO executor.Executor: Running task 0.0 in stage 11.0 (TID 11)
17/02/04 21:55:50 INFO kafka.KafkaRDD: Beginning offset 1000011 is the same as ending offset skipping spark-topic 0
17/02/04 21:55:50 INFO executor.Executor: Finished task 0.0 in stage 11.0 (TID 11). 915 bytes result sent to driver
17/02/04 21:55:50 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 11.0 (TID 11) in 13 ms on localhost (1/1)
17/02/04 21:55:50 INFO scheduler.DAGScheduler: ResultStage 11 (print at KafkaReceiver.scala:27) finished in 0.012 s
17/02/04 21:55:50 INFO scheduler.DAGScheduler: Job 11 finished: print at KafkaReceiver.scala:27, took 0.063248 s
-------------------------------------------
Time: 1486263350000 ms
-------------------------------------------
Hello Spark Streaming

17/02/04 21:55:50 INFO scheduler.JobScheduler: Finished job streaming job 1486263350000 ms.0 from job set of time 1486263350000 ms
17/02/04 21:55:50 INFO scheduler.JobScheduler: Total delay: 0.115 s for time 1486263350000 ms (execution: 0.079 s)
17/02/04 21:55:50 INFO rdd.MapPartitionsRDD: Removing RDD 21 from persistence list
17/02/04 21:55:50 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 11.0, whose tasks have all completed, from pool
17/02/04 21:55:50 INFO storage.BlockManager: Removing RDD 21
17/02/04 21:55:50 INFO kafka.KafkaRDD: Removing RDD 20 from persistence list
17/02/04 21:55:50 INFO scheduler.ReceivedBlockTracker: Deleting batches ArrayBuffer()
17/02/04 21:55:50 INFO scheduler.InputInfoTracker: remove old batch metadata: 1486263340000 ms
17/02/04 21:55:50 INFO storage.BlockManager: Removing RDD 20




Wednesday, February 11, 2015

Simple Secure websocket backend server connection with Apache SSL Proxy

BROWSE <-----SSL connection-----> APACHE PROXY SERVER (SSL Enabled) <----Regular plain text----> TOMCAT WEB SOCKET SERVER

1. Enable following modules:
LoadModule ssl_module modules/mod_ssl
LoadModule socache_shmcb_module modules/mod_socache_shmcb.so

2. Create self-signed digital certificate for testing as follow using openssl
> openssl genrsa -out server.key 2048 (generate private key)
> openssl req -new -key server.key -out server.csr (generate CSR)
> openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt (generate self-signed certificate from CSR)

and copy the crt and key file at: C:/Apache24/conf/ folder

3. Enable the httpd-ssl.conf in the httpd.conf file as:
Include conf/extra/httpd-ssl.conf

4. Open httpd-ssl.conf file and update the following (comment Listen 443 line if starting server gives error about the port already used by 443)
SSLCertificateFile "${SRVROOT}/conf/server.crt" and
SSLCertificateKeyFile "${SRVROOT}/conf/server.key"

5. Start the server using ./httpd.exe and see if there is any issue, if no issue then access the page using https://localhost, this will launch the secure page

6. Add the Proxy in the httpd.conf file for the Tomcat as:
ProxyRequests Off
ProxyPreserveHost On
ProxyPass /test http://tomcat-host-ip:8090/test
ProxyPassReverse /test http://tomcat-host-ip:8090/test

Here Tomcat is running at 8090 port and Proxy is set up as http, i.e. from apache to tomcat, the data is passes as plain text.

7. To automatically change the http to https when user access http, add following line in the httpd.conf file just above the ProxyPass set up as:
RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI}

and all set, run the Apache and Tomcat and access the Tomcat page from Apache as https://localhost/text. This will be Proxied to Tomcat as desired.

=========================================================================

FOR TOMCAT AJP CONNECTOR

1. Enable tomcat AJP1.3 in server.xml in Tomcat's configuration directory
2. Load the module
mod_proxy and mod_proxy_ajp in httpd.conf file in Apache 2.4
3. Set the Proxy setup as:
ProxyPass /test ajp://tomcat-host-ip:8090/test
ProxyPassReverse /test ajp://tomcat-host-ip:8009/test

Simple unsecured Apache Proxy for Tomcat Web Socket Application

BROWSE <===> APACHE PROXY SERVER <===> TOMCAT WEB SOCKET SERVER

This is tested for the Apache Haus v 2.4.10 with Tomcat v 7 in the Windows development environment.

1. Install Apache Haus 2.4.10 in Windows Environment.
2. Install Tomcat 7 in the Windows Environment either in same machine or different machine.
3. To create websocket proxy for the Apache load following module by removing LoadModule comments in httpd.conf file:

LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_wstunnel_module modules/mod_proxy_wstunnel.so

For further, please refer Apache documentation at: http://httpd.apache.org/docs/2.4/mod/mod_proxy_wstunnel.html and add following ProxyPass setup in httpd.conf configuration file

ProxyRequests Off
# If tomcat is in same machine (for websocket)
ProxyPass /websocket/echo ws://localhost:8090/websocket/echo          -> look for the trailing "/"
ProxyPassReverse /websocket/echo ws://localhost:8090/websocket/echo             -> look for the trailing "/"

# For other http request
ProxyPass /test http://localhost:8090/test
ProxyPassReverse /test http://localhost:8090/test

# If Tomcat is in different/remote machine
ProxyPass /websocket/echo ws://remote-host:8090/websocket/echo
ProxyPassReverse /websocket/echo ws://remote-host:8090/websocket/echo

# For other http request
ProxyPass /test http://remote-host:8090/test
ProxyPassReverse /test http://remote-host:8090/test

This configuration assume that Apache use port 80 and Tomcat use port 8090 and servlet context deployed in tomcat /test. Now the http or ws call to Apache will automatically redirect to corresponding Tomcat Server once it find the Proxy setting. Also take special look at the trailing "/". If you add trailing "/" for /websocket/echo/, then you need to add trailing "/" for ws://localhost:8090/websocket/echo/ too or vice-versa.

Also the ProxyPass has to be set up in specific order (take a look at the Apache documentation for this).

Here is how someone in Stack Overflow did (very useful): http://stackoverflow.com/questions/17649241/reverse-proxy-with-websocket-mod-proxy-wstunnel

4. Create simple websocket application in Tomcat as:
@ServerEndpoint("/echo")
public class EchoWebSocket {
    private Logger logger = Logger.getLogger(this.getClass().getName());
    @OnOpen
    public void onOpen(Session session) {
        logger.info("Connected .... " + session.getId());
    }
    @OnMessage
    public String onMessage(String message, Session session) {
        switch (message) {
        case "quit":
            try {
                session.close(new CloseReason(CloseCodes.NORMAL_CLOSURE,
                        "Connection is closed."));
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
            break;
        }
        return message;
    }
    @OnClose
    public void onClose(Session session, CloseReason closeReason) {
        logger.info(String.format("Session %s closed because of %s",
                session.getId(), closeReason));
    }
}

*Remember when deploying the Tomcat application, don't put websocket-api.jar library in the application classpath. Since this library is already included in the Tomcat library, adding this in application class-path may not work (At least for me it didn't work).

5. Add the JavaScripts HTML5 WebSocket API script as:

function wsocket() {
    var ws = null;
    var wsProtocol = (window.location.protocol === "https:" ? "wss" : "ws");
    var target =  wsProtocol + "://" + window.location.host + "/dwrdemo/echo";
    if ("WebSocket" in window) {
        ws = new WebSocket(target);
    } else if ("MozWebSocket" in window) {
        ws = new MozWebSocket(target);
    } else {
        alert('WebSocket is not supported by this browser.');
    }

    ws.onopen = function() {
        console.log("WebSocket has been opened!");
    };

    ws.onmessage = function(message) {
       console.log("WebSocket message is: " + message.data);
    };

    ws.onerror = function() {
        console.log("WebSocket connection has error!");
    }
   
    ws.onclose = function() {
        console.log("WebSocket is closed.");
    }
}


6. Add the Tomcat deployable war file in the Tomcat Server and test to make sure it is working. Once working access the page using the Apache web server.
7. The Request and response header for the websocket look something like as shown below if connection is established successfully

URL: http://localhost/websocket/echo

Request Headers:
    Accept    text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Encoding    gzip, deflate
    Accept-Language    en-US,en;q=0.5
    Cache-Control    no-cache
    Connection    keep-alive, Upgrade
    Cookie    JSESSIONID=3CD55817617CA29EF8F133721A1E1388
    Host    localhost
    Origin    http://localhost
    Pragma    no-cache
    Sec-WebSocket-Key    2cWlxhpbtE0dO1ijeaocaA==
    Sec-WebSocket-Version    13
    Upgrade    websocket
    User-Agent    Mozilla/5.0 (Windows NT 6.1; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0

Response Headers:
    Cache-Control    private
    Connection    upgrade
    Date    Fri, 06 Feb 2015 15:15:01 GMT
    Expires    Wed, 31 Dec 1969 19:00:00 EST
    Sec-WebSocket-Accept    jMl5aXuuLM8xJ35XDGAY8uncIWk=
    Server    Apache-Coyote/1.1
    Upgrade    websocket

Take a look at the Connection Upgrate, the regular http request is updated to websocket for the websocket connection. This is how the websocket work by upgrading the current protocol in the initial handshake.

Testing using JConsole as:

var ws = new WebSocket("ws://localhost/websocket/echo");
ws.readyState

Tuesday, January 27, 2015

Mutual-Authentication with Client Self Signed Digital Certificate with Tomcat SSL Configuration

MUTUAL-AUTHENTICATION WITH CLIENT DIGITAL CERTIFICATE AND TOMCAT SSL CONFIGURATION (FOR DEV and TEST):

In Order to authenticate server and client certificate using mutual authentication in Tomcat SSL configuration, we need to create the pair of keys in server, client. The client certificate is then added to the server trustStore.
1. First generate the server-cert using keytool utility as:
> keytool -genkeypair -alias servercert -keyalg RSA -dname "CN=Your Server DNS,OU=ORG,O=COM,L=NYC,S=NY,C=US" -keypass <password> -keystore server.jks -storepass <password>

usually the keypass, and storepass is kept same. The above keytool command create the server.jks keystore with servercert as alias. The command creates server Private Key and other information and protect the Private key with the password supplied.
To view the content, type the command as:
> keytool -list -v -keystore server.jsk -storepass <password>

2. Next create client keypair as:
> keytool -genkeypair -alias clientStore -keystore clientStore.p12 -storetype pkcs12 -keyalg RSA -dname "CN=Your Client,OU=ORG,O=COM,L=NYC,S=NY,C=US" -keypass <password> -storepass <password>

This will also create the clientStore like the one in 1 with Private Key and certificate.

3. Now export the certificate that is created in keystore as in the step 2 using the command as:
> keytool -exportcert -alias clientStore -file clientStore.cer -keystore clientStore.p12 -storetype pkcs12 -storepass <password>

This command will export the client certificate clientStore.cer file and import this file to trustStore as given below:

4: importing the certificate to trustStore as:
> keytool -importcert -keystore server.jks -alias clientStore -file clientStore.cer -v -trustcacerts -noprompt -storepass <password>

5. Once you have imported the client certificate to the trustStore, you can view the content using the following command as:
> keytool -list -v -keystore server.jks -storepass <password>

6. Now the server-cert keystore can be dropped in the Tomcat {CATALINA_HOME}/config/ directory and enable the SSL authentication using the following Connector configuration:
<Connector port="8443" protocol="org.apache.coyote.http11.Http11Protocol"
    maxThreads="150" SSLEnabled="true" scheme="https" secure="true"
    clientAuth="true" sslProtocol="TLS" keystoreFile="{CATALINA_HOME}/config/server.jks" keystorePass="password"
    truststoreFile="{CATALINA_HOME}/config/server.jks" truststorePass="password" truststoreType="JKS"/>

For the server as well as client certification authentication use clientAuth="true" and required to add trustStoreFile, trustStorePass and trustStoreType. The trustStoreType is by default JKS.

7. Now download the clientStore.p12 file in the client end and install in the browser as client certificate. The certificate will as password, provider the password and it will be all set for the mutual SSL communication.
Now connect the secure page using https and if certificate exception occurs (which will be in Firefox, since this is self-signed certificate), add the exception and you should be able to go to the secure page.
This way the SSL connection established using Mutual Certificate Authentication. For production environment, the certificate need to be verified by CA (Certificate Authority).

8. To create more client certificates, repeat step 2, 3, and import to step 4. This way multiple client certificate can be created for multiple clients for client authentication.


For more information, refer following resources:
http://stackoverflow.com/questions/1180397/tomcat-server-client-self-signed-ssl-certificate
http://docs.oracle.com/middleware/1213/edq/DQSEC/ssl_tomcat.htm#DQSEC164
https://twoguysarguing.wordpress.com/2009/11/03/mutual-authentication-with-client-cert-tomcat-6-and-httpclient/
http://java-notes.com/index.php/two-way-ssl-on-tomcat
http://docs.geoserver.org/latest/en/user/security/tutorials/cert/index.html

Sunday, January 25, 2015

Apache Proxy for Tomcat

ADDING APACHE WEB SERVER AS PROXY FOR TOMCAT
1. Configure the copy of Apache so that it includes the mod_proxy module. In httpd.conf file enable these:
LoadModule proxy_module {path-to-module}/mod_proxy.so and
LoadModule proxy_http_module {path-to-module}/mod_proxy_http.so

2. Next Add Two Directives in the httpd.conf file for each web application that need to forward to the Tomcat as:
ProxyPass /myapp http://tomcat-ip:8081/myapp
ProxyPassReverse /myapp http://tomcat-ip:8081/myapp

More on: http://tomcat.apache.org.tomcat-6.0-doc/proxy-howto.html#Apache_2.0_Proxy_Support

Configuring SSL in Tomcat

CONFIGURE SIMPLE SSL USING TOMCAT
1. Create simple KeyStore file in your machine using following command:

%JAVA_HOME%\bin\keytool -genkey -alias tomcat -keyalg RSA
(default it stores in your \Users directory as .keystore file

OR

%JAVA_HOME%\bin\keytool -genkey -alias tomcat -keyalg RSA \
  -keystore \path\to\my\keystore
 
2. Once the keystore file is created, add the following line in your Tomcat Server.xml file as:
       
<!-- Define a SSL HTTP/1.1 Connector on port 8443
     This connector uses the BIO implementation that requires the JSSE
     style configuration. When using the APR/native implementation, the
     OpenSSL style configuration is required as described in the APR/native
     documentation -->

<Connector port="8443" protocol="org.apache.coyote.http11.Http11Protocol"
        maxThreads="150" SSLEnabled="true" scheme="https" secure="true"
        clientAuth="false" sslProtocol="TLS" keystoreFile="\path\to\my\keystore\.keystore" keystorePass="your_password"/>
       

3. Add Security setting in your application's web.xml file as:

<security-constraint>
    <web-resource-collection>
        <web-resource-name>your_app_name</web-resource-name>
        <url-pattern>/*</url-pattern>
    </web-resource-collection>
    <user-data-constraint>
        <transport-guarantee>CONFIDENTIAL</transport-guarantee>
    </user-data-constraint>
</security-constraint>

4. Access your app using https://localhost:8443/your_app_name
if you access using http://localhost:8080/your_app_name, it will redirect to https because of the web.xml configurations

5. For more information check the Apache Tomcat Document Page

URL REWRITING in TOMCAT

TOMCAT URL REWRITE USING URLREWRITEFILTER BY TUCKEY.ORG
In order to rewrite URL directly from the TOMCAT ROOT context:

1. Copy the urlrewritefilter-4.0.3.jar file to {TOMCAT_HOME}/lib folder

2. Add urlrewrite.xml file in the {TOMCAT_HOME}/webapps/ROOT/WEB-INF folder and write your own rule something like:

<urlrewrite>
    <rule>
        <from>^/([a-z]+)/some_app</from>
        <to type="redirect">/other_app/LoginServlet?param1=$1</to>
    </rule>
</urlrewrite>

3. If you are redirecting the url from your app, then add these line at the top of the web.xml file's Servlet mapping, for ROOT redirect, add these to ROOT/WEB-INF/web.xml file as:
<filter>
    <filter-name>UrlRewriteFilter</filter-name>
    <filter-class>org.tuckey.web.filters.urlrewrite.UrlRewriteFilter</filter-class>
</filter>
<filter-mapping>
    <filter-name>UrlRewriteFilter</filter-name>
    <url-pattern>/*</url-pattern>
    <dispatcher>REQUEST</dispatcher>
    <dispatcher>FORWARD</dispatcher>
</filter-mapping>

4. Now access the page using http://localhost:8090/abc/some_app, this will redirect the page to http://localhost:8090/other_app/LoginServlet?param1=abc

More information and tutorials on: http://www.tuckey.org/urlrewrite/