Sunday, November 06, 2016

Big Data - Harisfazillah Jamel - Startup and Developer 4th Meetup 5th November 2016

  1. Big Data. Harisfazillah Jamel. Startup and Developer 4th Meetup, 5th November 2016

  2. Why Big Data? Big Data is not only for the big players. Big Data is also for us, startups and developers. Data is raw gold. Information about us is the end product. Data defines us: web server logs, web page analytics and comments about our products.

  3. What Is Big Data? Big data is a term for data sets that are so large or complex that traditional data processing applications are inadequate to deal with them. Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, querying, updating and information privacy. (Wikipedia) Let's redefine big data for us.

  4. What Is Big Data? The four Vs: ● Volume: very big data ● Variety: multiple sources ● Velocity: streaming data ● Veracity: accuracy of the data

  5. Redefine Big Data For Startups Four important terms: ● Data Sets ● Data Processing ● Analytics ● Visualization Big Data is big. We need to focus.

  6. What Should We Call Our Big Data? ● Small Data ● Startup Data ● No Data We need to visualize our data from day 0. It’s a must.

  7. Why Big Data? Big data analytics examines large amounts of data to uncover hidden patterns, correlations and other insights. (SAS) We need to know our own insights. Visualize our future.

  8. Data Sets We don’t have any data (no data), or we lack data. If we want data, we go and find it. With our own data, we have a place to start.

  9. Data Set: Our Own Data? ● Web server log ○ IP addresses of the visitors (IP2Country) ● Web access analysis ○ Most visited pages ● Comments from our users ○ Good, bad, like, dislike

  10. Issues With The Data? Lack of usable information. We need to collect data on our own. This is a business opportunity for startups.

  11. What Needs To Be Collected?

  12. Good, Bad, Like, Dislike What we want to know from big data, and from any data that we analyse, is this: GOOD, BAD, LIKE, DISLIKE. This is sentiment analysis.

  13. When, Who, Where, What, Why, How ● When - @timestamp is important for data analysis. ● Who - Anonymity is important, but we need to know whether the visitor is male or female and his or her age. ● Where - Anonymity is important, but we still need the IP address to know which country, state or county the visitor comes from. ● What - The operating system and the browser version. ● Why - The keywords that lead them to us. ● How - How they found out about us.

  14. How To Visualize Our Data I’m a fan of ELK: Elasticsearch, Logstash & Kibana. ELK is one of the Big Data tools.

  15. Index The Data With ES Use Elasticsearch to index our data. One misconception: ES is not for storage. Don’t use ES to store our data. Data needs to be archived elsewhere.

  16. ES Search API The result is in JSON. Developers love JSON (maybe). 0/_exploring_your_data.html

  17. Kibana We can use Kibana to view our data in ES.

  18. DKAN We can store data with DKAN. DKAN follows CKAN. It is the open source open data platform with a full suite of cataloging, publishing and visualization features that allows organizations to easily share data with the public. Take advantage of the DKAN Datastore API.

  19. GeoSpatial Is Important Our data needs to have spatial information (GPS coordinates). We can use GeoServer to run our own map server.

  20. The End Q & A [email protected] 019-6085482
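
As a companion to slides 9 and 13, here is a minimal Python sketch of pulling the "when", "where", "what" and "how" fields out of a web server log and counting the most visited pages. It assumes the log is in the Combined Log Format that Apache and nginx commonly write. The sample lines, the names parse_line and most_visited, and the IP addresses are made up for illustration and are not from the talk. Mapping an IP address to a country (IP2Country) needs an external database and is left out.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: the server writes the Combined Log Format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # When: turn the text timestamp into a timezone-aware datetime.
    fields["timestamp"] = datetime.strptime(
        fields["timestamp"], "%d/%b/%Y:%H:%M:%S %z"
    )
    return fields


def most_visited(lines, top=3):
    """Count requests per path and return the most visited pages."""
    counts = Counter()
    for line in lines:
        fields = parse_line(line)
        if fields is not None:
            counts[fields["path"]] += 1
    return counts.most_common(top)


# Made-up sample lines; the IP addresses come from documentation-only ranges.
SAMPLE_LINES = [
    '203.0.113.7 - - [05/Nov/2016:10:15:32 +0800] "GET /index.html HTTP/1.1" '
    '200 5120 "https://www.example.com/search?q=big+data" '
    '"Mozilla/5.0 (X11; Linux x86_64)"',
    '198.51.100.23 - - [05/Nov/2016:10:16:02 +0800] "GET /about.html HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0 (Windows NT 10.0)"',
    '203.0.113.7 - - [05/Nov/2016:10:17:45 +0800] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

for page, hits in most_visited(SAMPLE_LINES):
    print(page, hits)
```

The referrer field is where the "why" (search keywords) and the "how" (the site that linked to us) would come from, and the user agent carries the operating system and browser version.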

Sunday, March 20, 2016

Call For Speakers Malaysia Open Source Conference 2016 MOSC MY

The call for speakers is open to all individuals, organizations, universities, companies and government agencies who wish to present a case study, development, implementation or application. The presentation paper or slides must follow a knowledge-sharing concept and must not contain marketing material that promotes a particular product or company.

MOSC MY 2016 :

Call For Speakers form :

Over the years, the Malaysia Open Source Conference, or MOSC MY, has brought together thousands of participants, including CEOs and leaders, vendors, consultants, associations and regulators from around Malaysia and the world, to address mutual challenges and share information on Open Source Software.

With "A New Beginning" as the theme for 2016, we are addressing the main technology focus and trends for most consumers.

MOSC MY 2016 is set to explore Open Source software and technology at the enterprise level, and to promote the development of local Open Source solutions for enterprise environments to be used worldwide.

Date : 25-27 May 2016 (Wednesday Till Friday)
Time : 9am till 5pm
Venue : Faculty of Information Science & Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor.


Tuesday, February 09, 2016

Install Oinkmaster For Suricata IDS / IPS / Network Security Monitoring Software

I'm using Suricata, the IDS / IPS / network security monitoring software, and logging its alerts to syslog. Once the alerts are in syslog, they can be processed later by Logstash, stored in Elasticsearch and viewed in Kibana. I'm using Ubuntu Linux Server 14.04 LTS for this setup.

Simple guide :-

1) Install suricata and oinkmaster

apt-get update
apt-get install suricata oinkmaster

 * suricata disabled, please adjust the configuration to your needs
 * and then set RUN to 'yes' in /etc/default/suricata to enable it.

2) Download rules

2.1) Create directory

mkdir /etc/suricata/rules

Edit /etc/oinkmaster.conf using vi or pico and add this line.

url =

Save the file and run a test

oinkmaster -C /etc/oinkmaster.conf -o /etc/suricata/rules

Check the directory /etc/suricata/rules. All the rules should have been downloaded there.
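
If you prefer a scripted check over eyeballing the directory, here is a small Python sketch. The function name summarize_rules is my own and not part of Oinkmaster or Suricata. It treats every non-empty line that does not start with # as one active rule, which is only an approximation.

```python
from pathlib import Path


def summarize_rules(rules_dir="/etc/suricata/rules"):
    """Map each *.rules file name to its number of active (non-comment) lines."""
    directory = Path(rules_dir)
    if not directory.is_dir():
        # Nothing downloaded yet, or the path is wrong.
        return {}
    summary = {}
    for rule_file in sorted(directory.glob("*.rules")):
        active = 0
        with rule_file.open(encoding="utf-8", errors="replace") as handle:
            for line in handle:
                stripped = line.strip()
                # Skip blank lines and rules that are commented out.
                if stripped and not stripped.startswith("#"):
                    active += 1
        summary[rule_file.name] = active
    return summary
```

Calling summarize_rules() after the test run should return a non-empty dictionary. An empty result suggests the download did not write anything into the directory.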

3) Create a cron file named oinkmaster in /etc/cron.d

pico /etc/cron.d/oinkmaster

Add this content. Files in /etc/cron.d need a user field after the schedule, so the job runs as root here.

0 2 * * * root /usr/sbin/oinkmaster -C /etc/oinkmaster.conf -o /etc/suricata/rules

4) Edit /etc/suricata/suricata-debian.yaml

# Configure the type of alert (and other) logging you would like.

  # a line based alerts log similar to fast.log into syslog
  - syslog:
      enabled: yes
      # reported identity to syslog. If ommited the program name (usually
      # suricata) will be used.
      #identity: "suricata"
      facility: local5
      #level: Info ## possible levels: Emergency, Alert, Critical,
                   ## Error, Warning, Notice, Info, Debug


  - syslog:
      enabled: yes
      facility: local5
      format: "[%i] <%d> -- "

# Set the default rule path here to search for the files.
# if not set, it will look at the current working dir
default-rule-path: /etc/suricata/rules

classification-file: /etc/suricata/rules/classification.config
reference-config-file: /etc/suricata/rules/reference.config
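
Once the alerts land in syslog, something has to split each line into fields before it is useful in Elasticsearch. Logstash normally does this with a grok filter. The Python sketch below only illustrates the idea. The sample line is an assumption modeled on the fast.log layout, with made-up addresses, and the name parse_alert is hypothetical. Check the pattern against what your own syslog actually contains.

```python
import re

# Assumption: the alert text follows the fast.log layout:
# [gid:sid:rev] message [Classification: ...] [Priority: N] {PROTO} src:port -> dst:port
ALERT_PATTERN = re.compile(
    r'\[(?P<gid>\d+):(?P<sid>\d+):(?P<rev>\d+)\] '
    r'(?P<message>.*?) '
    r'\[Classification: (?P<classification>[^\]]*)\] '
    r'\[Priority: (?P<priority>\d+)\] '
    r'\{(?P<protocol>[^}]+)\} '
    r'(?P<src_ip>[0-9A-Fa-f.:]+):(?P<src_port>\d+) -> '
    r'(?P<dst_ip>[0-9A-Fa-f.:]+):(?P<dst_port>\d+)'
)

NUMERIC_FIELDS = ("gid", "sid", "rev", "priority", "src_port", "dst_port")


def parse_alert(line):
    """Return the alert fields as a dict, or None if the line is not an alert."""
    # search() instead of match() so any syslog prefix before the alert is skipped.
    match = ALERT_PATTERN.search(line)
    if match is None:
        return None
    alert = match.groupdict()
    for name in NUMERIC_FIELDS:
        alert[name] = int(alert[name])
    return alert


# Illustrative line only; the host name and addresses are invented.
SAMPLE = (
    "Feb  9 10:15:32 sensor suricata[1234]: [1:2100498:7] "
    "GPL ATTACK_RESPONSE id check returned root "
    "[Classification: Potentially Bad Traffic] [Priority: 2] "
    "{TCP} 203.0.113.7:80 -> 198.51.100.23:51514"
)

print(parse_alert(SAMPLE))
```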
