Posts

Using Spark to load DynamoDB tables with AWS Spark-Dynamodb connector.

Monitoring DynamoDB Capacity

Getting a true picture of DynamoDB WCU/RCU consumption is difficult because the default monitors automatically aggregate WCU/RCU metrics by minute. This hides spikes and obscures actual WCU/RCU consumption (CloudWatch applies the same per-minute aggregation). To get a more accurate picture of these metrics, we decided to use the Grafana/Influx stack described in my other post to capture second-level metrics for WCU/RCU consumption.

Our use case

200 TB dataset stored in Parquet on S3
Ingest into 20 DynamoDB tables using Spark
S3 -> Spark -> DynamoDB using the AWS Labs emr-dynamodb-hadoop connector

To ingest a dataset this large in a reasonable amount of time, we need to make sure DynamoDB is using all available capacity across all tables, so good monitoring is critical. The AWS Labs Spark connector emr-dynamodb-hadoop has params which let you configure what percentage of your DynamoDB provisioned capacity should be ...
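
To make the point about per-minute aggregation concrete, here is a minimal sketch in Python. The trace values are made up for illustration and do not come from the dataset described in the post. It only shows how averaging over a minute can flatten a short burst.

```python
from statistics import mean


def per_minute_average(per_second_units):
    """Average consumed capacity units over each 60-second window."""
    return [
        mean(per_second_units[i:i + 60])
        for i in range(0, len(per_second_units), 60)
    ]


# Hypothetical one-minute trace: 55 quiet seconds followed by a 5-second burst.
trace = [100] * 55 + [2000] * 5

minute_view = per_minute_average(trace)  # what a per-minute monitor would report
peak = max(trace)                        # what per-second metrics would reveal

print("per-minute average:", minute_view[0])
print("per-second peak:", peak)
```

The per-minute figure sits far below the burst, which is the kind of spike that second-level metrics are meant to expose.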

Using SigNoz and OpenTelemetry as an alternative to Datadog

Datadog is an essential tool for monitoring large applications, but for hobby projects SigNoz is a great open source alternative that provides similar functionality. It's also free and easy to set up using docker-compose.

Start off by cloning the SigNoz GitHub repo.

    git clone https://github.com/SigNoz/signoz
    cd signoz/deploy

If you already have docker and docker-compose installed then you can skip this, but otherwise run the install script.

    ./install.sh

Launch the SigNoz service. The docker-compose setup includes a ClickHouse database, a Zookeeper service, and a sample application called hotrod. This will bring up everything.

    docker-compose -f docker/clickhouse-setup/docker-compose.yaml up

Then navigate to http://localhost:3301/ and you'll be prompted to set up an account and admin password. You'll then be able to see the SigNoz homepage with some example metrics. We can then start on integrating the OpenTelemetry metrics with our Java app. Open telemetry automat...

Installing InfluxDB and Grafana on EMR

I wanted to install recent versions of the Grafana/InfluxDB stack onto an EMR cluster node for publishing metrics from Spark. This will allow me to receive metrics coming directly from the Spark listeners. I used the Dockerfile here for configs and setup.

Find the Amazon Linux version

    cat /etc/issue
    Amazon Linux AMI release 2018.03
    cat /etc/system-release
    Amazon Linux AMI release 2018.03

Install version 5.4.2 of Grafana

    sudo yum install https://dl.grafana.com/oss/release/grafana-5.4.2-1.x86_64.rpm

Install version 1.7.2 of InfluxDB

    sudo yum install https://repos.influxdata.com/centos/6/x86_64/stable/influxdb-1.7.2.x86_64.rpm

Configure Grafana/Influx

Move influxdb/config.toml from the linked repo to /etc/influxdb/config.toml
Move graphana/config.ini from the linked repo to /etc/grafana/config.ini
Add these lines to the Grafana config file:

    type = influxdb
    host = localhost:8086
    name = grafana
    user = grafana
    password = grafana

Start Influx

    /usr/bin/influxd ...
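
Once InfluxDB is listening on localhost:8086, a metric can be pushed to it over the InfluxDB 1.x HTTP `/write` endpoint using line protocol. The following is a rough sketch rather than the setup used in the post: the database name `spark_metrics`, the measurement and tag names, and both helper functions are hypothetical, and the formatter assumes names without spaces or commas and numeric field values.

```python
import time
import urllib.parse
import urllib.request


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as InfluxDB line protocol (float fields only).

    Assumes names and tag values contain no spaces, commas, or equals signs.
    """
    tag_part = "".join(f",{key}={value}" for key, value in sorted(tags.items()))
    field_part = ",".join(
        f"{key}={float(value)}" for key, value in sorted(fields.items())
    )
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


def write_point(line, host="localhost:8086", database="spark_metrics"):
    """POST one line to the InfluxDB 1.x /write endpoint.

    The database name is a placeholder and must already exist.
    """
    url = f"http://{host}/write?" + urllib.parse.urlencode({"db": database})
    request = urllib.request.Request(url, data=line.encode("utf-8"), method="POST")
    with urllib.request.urlopen(request) as response:
        return response.status  # InfluxDB 1.x answers 204 when the write succeeds


# Build a sample point; nothing is sent until write_point(line) is called.
line = to_line_protocol(
    "consumed_capacity",          # hypothetical measurement name
    {"table": "example_table"},   # hypothetical tag
    {"wcu": 120},
    time.time_ns(),
)
print(line)
```

Calling `write_point(line)` against a running instance would send the point; it is left out of the sketch so the block runs without a server.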

Packaging Electron Applications for OSX

I want to use Electron to create installable apps for OSX, Precise, and Trusty. I first attempted to use fpm (package distribution across multiple platforms), which works fine for creating rpm and deb packages, but creating the OSX pkg file proved difficult. These steps summarize how to create an installable .pkg from an Electron application.

NOTE: This tutorial does not cover creating valid packages for the Mac App Store, which has a whole different set of requirements. But you will end up with a .pkg file which can be distributed and used to install your application.

I. Re-Branding Electron

Unless you want your application to be named "Electron" after its installation, you will probably want to rename the application and change the default icon. On OSX, this is done by editing the existing Info.plist file in the Electron.app directory and changing four values to match your app's name. The Location of the Electron Info.plist file is found under the Electron.app...
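
The Info.plist edit can also be scripted. Below is a minimal sketch using Python's standard `plistlib`. The helper name `rebrand_info_plist` is hypothetical, and the excerpt above does not say which four values it changes, so the keys are left entirely to the caller.

```python
import plistlib


def rebrand_info_plist(plist_path, updates):
    """Overwrite selected keys in an Info.plist and return the new contents.

    `updates` maps plist keys to new values; which keys to change is up to
    the caller and is not prescribed by this sketch.
    """
    with open(plist_path, "rb") as handle:
        info = plistlib.load(handle)   # reads XML or binary plists

    info.update(updates)

    with open(plist_path, "wb") as handle:
        plistlib.dump(info, handle)    # writes XML format by default

    return info
```

A call might look like `rebrand_info_plist(path_to_info_plist, {"CFBundleName": "MyApp", "CFBundleDisplayName": "MyApp"})`, where `CFBundleName` and `CFBundleDisplayName` are standard macOS bundle keys used here only as examples and "MyApp" is a placeholder.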

Using Selenium Testing for Electron (Atom shell) Applications

Electron (formerly Atom Shell) is a very new way to quickly create JavaScript applications for multiple platforms. Not a lot of documentation exists about using Selenium to do full-fledged integration testing on Electron applications, and even less exists about performing such tests in Python. This brief tutorial should answer a few questions about Electron applications and show you how to get started using Selenium to test them.

I. Getting started

Install and start chromedriver

Selenium needs this to be able to make calls to the Electron app. Chromedriver acts as a bridge between Selenium and Chrome, and it follows the Selenium wire protocol. By default, chromedriver runs on port 9515. You can start it on an alternate port, but remember the assigned port, since it will be passed as an argument to Selenium later.

    ./chromedriver --port=9515

Install Selenium

You'll need to use Selenium's remote webdriver to interface with chromedriver. I'm using a python virtualenv to keep all my python plugi...
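
As a rough illustration of how the chromedriver port feeds into Selenium, here is a sketch assuming the Selenium 4 Python bindings, which may differ from the version the post was written against. The helper names are hypothetical, and the path to the Electron binary must be supplied by the caller.

```python
def chromedriver_url(port=9515, host="localhost"):
    """Build the address Selenium uses to reach a running chromedriver."""
    return f"http://{host}:{port}"


def connect_to_electron(electron_binary, port=9515):
    """Open a remote session that launches the given Electron binary.

    Assumes chromedriver is already running on `port` and that the
    Selenium 4 Python package is installed.
    """
    # Imported lazily so chromedriver_url() works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    # Point chromedriver at the Electron executable instead of Chrome.
    options.binary_location = electron_binary

    return webdriver.Remote(
        command_executor=chromedriver_url(port),
        options=options,
    )
```

With chromedriver started as shown above, `driver = connect_to_electron(path_to_electron_binary)` would return a driver object, and `driver.quit()` ends the session.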

Protostar: Stack5

Overview

I wanted to learn more about the fundamentals behind performing a buffer overflow, or "Stack Smashing" attack. This is a pretty common attack, though no modern system would really have this type of vulnerability. Still, this is a good way to gain some beginner's knowledge of reversing with GDB and the x86 assembly language, and because I am a diehard academic, let's dive in!

Overview of Buffer Overflows

Following the Exploit Exercises site here, we downloaded the Protostar VM. I started at Stack5, because Stack0-4 were relatively simple. Stack5 recreates a standard buffer overflow using shellcode as a payload. The source code includes nothing except for a simple 64-byte buffer that you are supposed to overflow. This was my introduction to using any type of shellcode-related exploit, so I have included links for my own reference.

A snip of the source code for this exercise

Crafting the Payload

Metasploit and MsfPayload are command-line too...