Riston

Docker Reference Sheet

October 20, 2023 by Riston

Introduction

Docker is a container management technology.
Containers are discrete units of software.

AWS EC2 Reference

August 10, 2023 by Riston

Quickstart

From the AWS Console, navigate to EC2. Then you will go to Instances, and select Launch Instances. From the “Launch an Instance” screen, you will perform the following actions:

Name the instance
Select the Amazon Machine Image/OS (Default is Amazon Linux)
Select the Instance Type (t2.micro is the default general EC2 type and is free tier eligible)
Generate a new SSH key pair for remote access to the server instance
Configure Network Settings (set allowed SSH origins, and HTTP and HTTPS connections)
Configure Storage (Sets the amount and type of Storage available in the instance)
Advanced – (contains a User Data field, which supplies a script that runs at boot)
Launch Instance

EC2 Types

Naming Convention

<instance_type><generation/version>.<size>
(t2.micro)
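
The naming convention above can be sketched as a small parser. This is only an illustration: the function name and the returned keys are my own, and the optional "attributes" group (for names such as m5a.large, where extra letters follow the generation) goes slightly beyond the three-part convention listed here.

```python
import re

# <family><generation>[optional attribute letters].<size>
INSTANCE_TYPE = re.compile(r"^([a-z]+)(\d+)([a-z-]*)\.([a-z0-9-]+)$")


def parse_instance_type(name):
    """Split an EC2 instance type name into its parts (hypothetical helper)."""
    match = INSTANCE_TYPE.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type: {name!r}")
    family, generation, attributes, size = match.groups()
    return {
        "family": family,               # e.g. "t" for burstable general purpose
        "generation": int(generation),  # e.g. 2
        "attributes": attributes,       # e.g. "a" in m5a; empty for t2
        "size": size,                   # e.g. "micro"
    }


print(parse_instance_type("t2.micro"))
```

Names that do not follow this simple pattern raise a ValueError rather than being guessed at.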

General Purpose

Balanced between computing, network, and memory
Good for most applications

Compute Optimized

Machine Learning
Batch Processing
Media Encoding/Transcoding
Gaming Servers

Memory Optimized

Databases
Cache Stores
Real-time data processing

Storage Optimized

Databases and Warehousing
Distributed File Systems
High frequency online transaction processing (OLTP) systems

Security Groups

Security groups operate as a firewall for specific EC2 instances, and several can be attached to a single instance. Both inbound and outbound rules can be configured based on port and IP address.

Connection Strategies

EC2 Instance Connect – Browser-based ssh connection (Security Group must allow connection)
SSH – Must have .pem file and security groups configured.

Purchasing Options

On-Demand – Pay for what you use, billing per second after one minute – short term workloads
Reserved (1 or 3 yrs) – Up to 72% less than on-demand, but requires a long term subscription.
Savings Plans (1 or 3 yrs) – Similar discounts to Reserved; commit to a consistent amount of usage (in $/hour), optionally tied to a specific instance family.
Spot Instances – Up to 90% less than on-demand, but you can lose the instance if your max price is less than the current spot price; you keep it only while (current spot price < max spot price). These are great for failure-resistant tasks like batch processing and data analysis.
Spot Fleets – Set of spot instances and optional on-demand instances
Dedicated Hosts – a physical server dedicated to your use; most expensive option.
Dedicated Instances – runs on hardware dedicated to you, no control over placement.
Capacity Reservations – reserved on-demand instances you can access at any time. Should be used with savings or reserved plans to maximize cost benefit.

Placement Groups

Cluster – clusters instances into low latency group in AZ
Spread – spreads instances across distinct underlying hardware (MAX 7 instances per AZ per group)
Partition – across partitions (racks) in AZ – scales to 100s of instances.

Elastic Network Interface (ENI)

A virtual network card used within VPCs that enables EC2 instances to access the network.

Can have 1 public IPv4 plus 1 or more private IPv4s
One Elastic IP per private IPv4
One or more security groups
One MAC address
Can be created independently of EC2 instance

https://aws.amazon.com/blogs/aws/new-elastic-network-interfaces-in-the-virtual-private-cloud/

Storage Options

EBS – Elastic Block Store

Overview

EBS Volumes attach a network drive to an EC2 instance
Allows instances to persist data, even after termination
Can only be mounted to one instance at a time (CCP Level)
Bound to specific AZs
Like a “Network USB stick”
Free tier offers 30 GB of data storage in SSD or Magnetic per month

EBS Snapshots

Create backups
Archives – 75% cheaper, but can take up to three days to restore from
Recycle Bin – Set period to retain deleted snapshots so that they can be restored (1 day to 1 year)
(FSR) Fast Snapshot Restore – expensive, but no latency on first use

Volume Types

gp2/gp3 – General purpose SSD volumes balancing price and performance.
io1/io2 – highest performance SSD volumes, for critical low-latency/high-throughput workloads.
- Provisioned IOPS (I/O Ops / Sec) – critical business applications that need more than 16,000 IOPS
- 4 GiB – 16 TiB – io1/io2
- 4 GiB – 64 TiB – io2 Block Express (256,000 IOPS)
- Multi-Attach to multiple (MAX 16) EC2 instances
st1 – Low cost HDD for frequently accessed, throughput-intensive workloads.
- Can’t be a boot drive
- Throughput Optimized (Data Warehousing)
sc1 – Lowest cost HDD volume for less frequently accessed workloads
- Can’t be a boot drive
- Archived Data

Encryption

On an encrypted volume, all data at rest and all data moving between instance and volume is encrypted
Minimal impact on latency
Leverage keys from KMS/AES-256
Handled transparently

To encrypt an unencrypted volume:

Create EBS snapshot of volume
Encrypt snapshot using copy
Create new Volume from the snapshot
Attach new encrypted volume to instance
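
The four steps above could be scripted roughly as follows. Treat this as a sketch under assumptions rather than a tested procedure: it expects a boto3 EC2 client to be passed in as `ec2`, the function name, description string, and default device name are hypothetical, and it does not detach or delete the old unencrypted volume.

```python
def encrypt_volume(ec2, volume_id, instance_id, region, zone, device="/dev/sdf"):
    """Create an encrypted copy of an EBS volume and attach it (sketch).

    `ec2` is assumed to be a boto3 EC2 client; `device` is a placeholder name.
    """
    # 1. Create an EBS snapshot of the unencrypted volume and wait for it
    snap = ec2.create_snapshot(VolumeId=volume_id, Description="pre-encryption copy")
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])

    # 2. Encrypt the snapshot by copying it with encryption enabled
    copy = ec2.copy_snapshot(
        SourceSnapshotId=snap["SnapshotId"],
        SourceRegion=region,
        Encrypted=True,
    )
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[copy["SnapshotId"]])

    # 3. Create a new volume from the encrypted snapshot, in the instance's AZ
    vol = ec2.create_volume(SnapshotId=copy["SnapshotId"], AvailabilityZone=zone)
    ec2.get_waiter("volume_available").wait(VolumeIds=[vol["VolumeId"]])

    # 4. Attach the new encrypted volume to the instance
    ec2.attach_volume(VolumeId=vol["VolumeId"], InstanceId=instance_id, Device=device)
    return vol["VolumeId"]
```

With boto3 installed and credentials configured, the client would typically come from `boto3.client("ec2")`. The zone has to match the instance's AZ, since EBS volumes are bound to a specific AZ.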

EFS – Elastic File System

Managed NFS (Network File System)
Can work in EC2 instances spanning multiple AZs
Expensive (roughly 3 x the cost of gp2), but highly available and scalable
Useful for CMS, web serving, data sharing
NFSv4.1
Only compatible with Linux-based AMIs
Performance Modes
- General Purpose – CMS, WordPress
- Max I/O – Media Processing – higher latency, higher throughput, highly parallel processing
Throughput Mode
- Bursting – 1 TB of storage gives 50 MiB/s baseline plus bursts up to 100 MiB/s
- Provisioned – set regardless of storage size
- Elastic – scale based on workload
Tiers
- Standard
- Infrequent Access (EFS-IA) – requires lifecycle policy
Availability
- Standard – Prod, Multi-AZ
- One Zone – Dev, Single AZ

Create a Custom Edit Template Link

May 28, 2023 by Riston

Continuing my recent dive into customizing and adding functionality to my series of custom taxonomies, I realized that I need a quick way to reach the edit page for each of these archive pages. WordPress has a built-in function for linking to a post's edit page, edit_post_link, but this does not work with taxonomy pages.

WordPress didn’t initially implement taxonomies to operate as posts, and while I’m not using taxonomies necessarily as posts, I do include a great deal of data with my custom taxonomies that requires updates. Here is a basic function I wrote that builds a link for accessing the taxonomy’s edit page:

// Accepts the relevant WP Term Object as the sole parameter
function edit_template_link($term){
    // Only administrators get the link; everyone else gets an empty string
    if(!current_user_can('administrator')){
        return '';
    }

    // Build the link to the term's edit screen in wp-admin
    $link = add_query_arg(
        array(
            'taxonomy' => $term->taxonomy,
            'tag_ID'   => $term->term_id,
        ),
        admin_url('term.php')
    );

    // Build and return the HTML element, escaping the URL for output
    $html = '<div class="edit-term-link">';
    $html .= '<a href="'.esc_url($link).'">Edit</a>';
    $html .= '</div>';

    return $html;
}

How To Create A Bidirectional Taxonomy Query

May 24, 2023 by Riston

One of the great benefits of the recent versions of Advanced Custom Fields is the ease with which users can generate custom post types and taxonomies. I have created a series of inter-related custom taxonomies in an effort to organize my thoughts and to create an ontological linking system throughout the site. Sort of a custom tool for cataloguing my varied interests.

Two of these custom taxonomies are grammatical_terms and morphemes. It’s important in any discourse to define one’s terms, and breaking down terms into their components (morphemes) and ruminating on their origins (etymology) can be an incredibly useful tool for assisting one in properly understanding and communicating definitions.

Using ACF, I am able to easily add morphemes to grammatical_terms and display them on their respective archive pages, but what if you would like to also display associated terms on the morpheme’s archive page? It would be tedious to have to add this manually on both your term and its respective morphemes.

WordPress has core functions that are useful for performing these types of operations across posts and post-associated taxonomies, but what if you are making custom taxonomy to custom taxonomy associations? This requires a bit of coding:

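function get_associated_terms($t_type, $meta_options){
    global $wpdb;

    // Match the id anywhere inside the serialized meta value;
    // prepare() and esc_like() take care of escaping the inputs
    $like = '%'.$wpdb->esc_like($meta_options['value']).'%';
    $query = $wpdb->prepare(
        "SELECT term_id FROM {$wpdb->termmeta} WHERE meta_key = %s AND meta_value LIKE %s",
        $meta_options['key'],
        $like
    );

    // get_col() returns a flat array of ids rather than row objects
    $term_ids = $wpdb->get_col( $query );

    $terms = array();

    foreach($term_ids as $id) {
        $term = get_term_by('term_taxonomy_id', $id);

        // Keep only terms that exist and belong to the requested taxonomy
        if($term && $term->taxonomy == $t_type){
            array_push($terms, $term);
        }
    }

    return $terms;
}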

We need to create a function that builds a simple SQL query, get_associated_terms. Since I’d like to reuse this function for other taxonomies, it accepts two parameters: the taxonomy type and an options array. The options array contains a key, which is the custom field’s name, and a value, which is the id of the taxonomy term being queried.

To understand how this is used, in my morphemes.php template I’d like to query the terms that have been associated with that morpheme. I have added morphemes to the terms, but not terms to morphemes; therefore I build the parameters for the term query like this:

    $term_options = array(
        'key' => 'morphemes',
        'value' => get_queried_object()->term_id
    );
    
    $terms = get_associated_terms("grammatical_term", $term_options);

This will query rows with the meta_key morphemes whose values (which are serialized arrays) contain the id of that morpheme. Note that a plain substring match on the id can also match longer ids that contain the same digits (5 also matches 15), so results are worth double-checking. We can then filter out the associated term ids that have the custom taxonomy type grammatical_term by iterating through the returned ids and retrieving the data for these terms: get_term_by('term_taxonomy_id', $id); It’s necessary to query by term_taxonomy_id if you want to get consistent data. In my experience, querying by id or term_id doesn’t return useful data (it generally returns the correct keys but with empty values).

The code returns the associated grammatical_terms successfully:

I hope this solution can provide utility with building up more complex inter-taxonomic queries and relationships. If you’d like to divest yourself of the hassle of coding, or would like to schedule a consultation, please check out my site https://futurelithics.com.

A Journey in Product Development: Part 3 – Creating a Clickable Prototype

January 11, 2021 by Riston

In Brief…

This is the culmination of the design stage before investing in any solid development decisions. Gathering user feedback from this artifact will be essential for finalizing and assembling a full-scope requirements document and technical design decisions such as infrastructure and stack. If the idea is terrible and not well received, at least the development cycle can be avoided before hiring additional team members. In this case, I’ll likely be the sole contributor and will work to complete this project whether it’s well received or not, but in a more business-centered context this could be an important stage in deciding whether the product lifecycle will progress or stall out.

Moodboards and bringing the app to life.

Since the moodboard for the design contains copyrighted material, I will not post a screen shot of it here, but you are welcome to check out my Pinterest board for this project. I feel that the design should have a look and feel appropriate for a music-centric app with “data-ist” aesthetics. A “dark mode” is almost always a great choice for visualizations outside of purely business-analytics oriented interfaces, and this project should emulate that style while using a vibrant color theme for actionable assets and visualizations. The moodboard will also continue to provide inspiration for developing the final visualization elements for the application.

General Flow

LogIn/Register Sequence

A fairly basic and generalized flow for this function should be sufficient. The only consideration is in later stages of the app’s development where we may wish to get the user to enter more personalized data for display in the online forum.

For the Login process, the app begins with a loading screen (currently containing a placeholder for the logo), then follows a sequence that can be tested in the prototype. I may in the finalized design add a password set/reset screen, but that is a decision that can wait.

Visualizations

For the visualization feature of the app, I created a few placeholder visuals using Illustrator. There is also the button for selecting these visuals in the bottom toolbar. There are not only visuals for streaming data, but also some basic “waveform/spectral” visualizations.

Radial Visual (Generic but works as a placeholder)

Linking things up

Adobe XD features fairly robust options for quickly generating clickable prototypes and publishing them for review. It is quite fascinating to see all the connection widgets graphically displayed, and it’s even more interesting to see the proposed architecture while in clickable prototype mode.

In Action…

A basic prototype is now complete, and can be viewed live here.

A Journey in Product Development – Part 2 : Decisions and Wireframes

January 9, 2021 by Riston

Decisions

Noticing that there was little need for distinct and separate screens for sampling, visualization mode, and the home screen, I decided to combine these screens into one. The different modes and functions could be determined by buttons, menus, and widgets within the UI to facilitate interaction. The distinct screens are now the main screen, the sample breakdown screens, the login/registration flow, and the online forum and repository.

While I have only made a few design decisions thus far, I have been able to break down the concept of the UI and the basic user-interaction flow more efficiently. The basic idea for the screens is laid out, and the basic interaction flow has been detailed using red annotations. All of the assets represented were developed personally using Adobe Illustrator.

Wireframe

A Journey in Product Development – Part 1 : Ideation and Content Mapping

January 6, 2021 by Riston

Introducing Spectrafact

Intelligent acoustical investigation app for sound identification and visualization.

Description

Spectrafact harnesses the power of Machine Learning and predictive analytics to isolate and identify prominent environmental sounds. Each sound can then be broken down into its principal waveforms and analyzed for its spectral properties. The user can then choose from multiple ways of visualizing the output data.

Functions

Identification of specific sounds (mostly instruments, but other sound sources as well)
Offer approximated waveform/wavelet properties that can be used for reconstruction via synthesis.
Acoustic analysis for live sound mixing or basic acoustical investigation during field recording.
Build network and repository of shared samples and associated data.

Target Audience

Professional and Amateur Sound designers.
Acoustic and Audio engineers.
Musicians and Performers.
Hobbyists and Scientists interested in learning more about sound.

Since this application would appeal to musicians and a slightly more technical audience, a slightly more sophisticated design is appropriate in this case. A whimsical or more light-hearted design concept would likely be off-putting and not work. The app’s design system should, however, feature interesting and somewhat “techie” design choices. A dark theme would likely be most appropriate since a key feature of the app is visualization. This will be fleshed out further in Part 3 of this series, featuring the clickable prototype built in XD.

Process

The main functionality of this app can be broken down to three primary user-flows. Firstly, the app is intended to allow users to visualize sound data that is either real-time or recorded. Secondly, the user should be able to record a sample, and then have the system intelligently classify the most likely sounds within the acoustical environment. Thirdly, the user should be able to connect with other app users and share samples via an online repository.

Content Map for Spectrafact app user flow.

Mock Album Covers – Vulpes

July 28, 2020 by Riston

Vulpes, the Fox-themed band.

“What the Hell?” is likely the first question on your mind.

In brief, I’ve got a bit of time on my hands and have been up-skilling through Coursera. One of the areas I’m branching into is Graphic Design, specifically beefing up my Adobe Creative Cloud skills. For the specific course in question, the guidelines suggest choosing a particular subject (likely an animal) that you will use throughout the course. Having loved foxes since I was a kid, there was little thought required on what my subject should be.

Fast-forward —>

One of the optional assignments suggested creating a series of images. I was soon struck with inspiration: a series of mock album covers! The “band” is Vulpes (no relation to the early 80’s Lisbon-based punk band), an entirely fox-themed series of albums! I’ve shared it, despite not by any means being “professional” work, because I’m sure that there are a few folks out there that might actually enjoy the humor in these. Anyway I had a lot of fun making them, and with experimenting with creating textures in Photoshop!

Vulpes:

Bob’s Awesome Music Venue in Fargo, ND!

June 26, 2020 by Riston

Where to open Bob’s awesome underground alternative music venue.

Machine Learning Capstone Project for
IBM’s Data Science Professional Certificate

1. Description of the problem

Bob Smith has recently come into a modest sum of money, and would like to fulfill his dream of opening a mid-sized music venue where he can book both local and larger performance artists, as well as providing a safe and interesting hangout for not only himself but adults of all ages. While he has the money to invest, he still needs to be prudent in his use of the funds, so he has some financial limitations. He also wishes to open specifically within the city of Fargo, North Dakota (for what reason only the gods may speculate).

1. Overhead/Rent – He needs to find a large enough space to host events and that would be suitable to house a small kitchen, a bar, a barista bar, and a seating section. He needs space while minimizing rent overhead.

2. Crime – He needs to be somewhere that economizes rent, but where violent crime is minimized, in order to cut down on venue security and provide a safer environment for his patrons.

3. Accessibility To Desired Demographic – Since it will be a music venue, his venue will likely need to be at least accessible to younger college-aged crowds who may not have reliable transportation. Being a music venue, having access to hotels might be desirable, and also being located in relative proximity to nearby places of congruous interests may also be valuable.

2. Background, Data, and Approach

Pricing:

There are a total of 38 neighborhoods within the city of Fargo itself (defined from Zillow’s OpenDataSoft), and median property value (ZHVI) can also be pulled in csv format from their site at https://www.zillow.com/research/data/ . While home value does not equal commercial property value, it can be used to make general assumptions about the relative costs likely to be involved.

Crime:
While there are crime stats available for consumption, I thought it might be interesting to use a keyword search from Google and to scrape sites indexed there in order to create a crime index. Using Python’s requests module to fetch and Beautiful Soup to parse content from open sites, I compare a selection of keywords associated with violent crime to count the number of articles that reference both crime and the neighborhood in question. (This is actually my first attempt at a Crawler/Scraper function, despite coding for quite a few years now).

Categories:

The category index is derived from the Foursquare API’s category attribute, and a list of unique venue categories is generated. From that list, a weighted list is manually generated based on the types of venues that would be indicative of a good area to open shop. Iterating through the venues, an index is created based on the number of most relevant venues within a given neighborhood.

3. Methodology and Exploratory Process:

Neighborhoods and Median Home Value:

For this data, I used Zillow/OpenDataSoft resources. Since there was no readily available neighborhood data otherwise, I had to parse out neighborhood names along with their geo coordinates (both 2d center points and geometrical shape boundaries). The geo data from the neighborhood dataset had to be formatted into a consumable geoJSON structure that could be digested by Folium in order to properly generate the neighborhood boundaries, completed immediately after merging the data from the ZHVI into the initial data frame:
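
To give a rough idea of the geoJSON reshaping described above, here is a minimal sketch. It is not the notebook's code: the function name, the input record layout, and the sample coordinates are all made up for illustration, and it assumes each neighborhood boundary is a single ring of [longitude, latitude] pairs.

```python
import json


def to_feature_collection(neighborhoods):
    """Wrap neighborhood records in a geoJSON FeatureCollection (sketch)."""
    features = []
    for hood in neighborhoods:
        ring = [list(point) for point in hood["boundary"]]
        # geoJSON polygon rings must be closed (first point repeated at the end)
        if ring[0] != ring[-1]:
            ring.append(ring[0])
        features.append({
            "type": "Feature",
            "properties": {"name": hood["name"]},
            "geometry": {"type": "Polygon", "coordinates": [ring]},
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical record; the coordinates are placeholders, not real boundaries
sample = [{
    "name": "Downtown",
    "boundary": [(-96.80, 46.87), (-96.78, 46.87), (-96.78, 46.88), (-96.80, 46.88)],
}]
print(json.dumps(to_feature_collection(sample), indent=2))
```

A structure like this can be handed to Folium's Choropleth as geo_data, with key_on set to "feature.properties.name" so that each shape is matched to its row in the data frame.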

And together this data was enough to generate the following choropleth map:

Crime Reports:

As mentioned, I thought it might be interesting to derive a basic “violent crime index” by running a hacked-together crawler/scraper, relying heavily on the Google Search Console and the Beautiful Soup module. I set a timer as a constraint to keep the program from lagging out due to slow servers and other issues that might arise. (Disclaimer: I only performed a limited set of calls for this, within the fairly limited number of free Google API calls permitted, since I did not wish to adversely affect anyone’s site.) I then parsed the page content for a small list of keywords that are directly associated with violent crime to create an index.
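
The counting step of that index can be sketched without any of the crawling. This is an illustration rather than the notebook's code: the function name, keyword list, and sample page texts are invented, and it assumes the page text has already been fetched and stripped of HTML.

```python
def crime_index(pages, neighborhood, keywords):
    """Count pages that mention the neighborhood and at least one keyword."""
    hood = neighborhood.lower()
    count = 0
    for text in pages:
        lowered = text.lower()
        if hood in lowered and any(word in lowered for word in keywords):
            count += 1
    return count


# Hypothetical keyword list and already-parsed page texts
KEYWORDS = ["assault", "robbery", "shooting"]
pages = [
    "Assault reported in Downtown Fargo late Friday",
    "Downtown farmers market opens for the season",
    "Robbery near West Acres under investigation",
]
print(crime_index(pages, "Downtown", KEYWORDS))
```

Simple substring matching like this is crude (it cannot tell a crime report from a story about falling crime), which is part of why the metric is only a rough signal.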

This approach, while not very scientific or practical, nonetheless was an interesting experiment. I could use this data to create another choropleth map, this time coding “crime” instead of “median_value”:

Venue Category Listings:

Without knowing anything about Fargo, ND, I had to rely entirely on a cursory Google search and the data/APIs listed above to work out my approach. The Foursquare API was invaluable in knowing what unique categories of venues exist in Fargo. After exploring crime-related news and median house values, it was necessary to begin exploring the neighborhoods in order to derive a list of unique venue categories.

From this list, I was able to select a sublist by hand containing the categories most relevant to the type of venue I’d like to open. I then assigned weights to each of the values, and could then perform a weighted assessment of the relevance of venues within a given neighborhood.
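
The weighted assessment amounts to summing a hand-picked weight for every venue in a neighborhood. A minimal sketch follows; the weights and category names here are invented for illustration and are not the ones used in the notebook.

```python
def category_index(venue_categories, weights):
    """Sum the weights of a neighborhood's venues; unlisted categories count as 0."""
    return sum(weights.get(category, 0) for category in venue_categories)


# Hypothetical hand-assigned weights for desirable venue categories
WEIGHTS = {"Music Venue": 3, "Brewery": 2, "Coffee Shop": 1}

# One entry per venue found in the neighborhood
downtown_venues = ["Brewery", "Coffee Shop", "Bank", "Brewery"]
print(category_index(downtown_venues, WEIGHTS))
```

Because irrelevant categories contribute nothing, a neighborhood full of banks and gas stations scores low no matter how many venues it has.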

This allowed me to generate another choropleth map, to which I also was able to append the list of categories to neighborhood:

The category list was way too long and full of unhelpful categories, so I pared it down by intersecting the neighborhood category list with the list of desirable categories.

K-Means Clustering:

For statistical analysis, a clustering technique appeared to be most appropriate in order to provide some level of neighborhood segmentation based upon the available data. Given the wildly different magnitudes of data domains (the indexes being low while the median_values were relatively massive by comparison), some preprocessing and standardization of the data was necessary to prevent one variable from completely dominating the others.
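
One common form of standardization is the z-score, which rescales each column to mean 0 and standard deviation 1. The sketch below uses only the standard library and made-up numbers; I am assuming z-scores here for illustration, and the notebook may have used a different scaler.

```python
from statistics import mean, pstdev


def standardize(column):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no information; map it to zeros
        return [0.0 for _ in column]
    return [(value - mu) / sigma for value in column]


# Hypothetical values: home prices dwarf the small index values
median_values = [150000, 220000, 310000]
crime_scores = [1, 4, 2]
print(standardize(median_values))
print(standardize(crime_scores))
```

After this step both columns live on the same scale, so distance-based methods such as K-Means no longer see only the property values.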

I then ran a distortion test in order to determine the ideal number of clusters for this model:

This seemed to indicate that between 4 and 6 clusters would be ideal, so I chose conservatively and went with 4.
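
For reference, a distortion test runs K-Means for a range of cluster counts and records the total squared distance from each point to its nearest centroid; the "elbow" where the curve flattens suggests a cluster count. The toy version below is a simplification with made-up points and a naive initialization (the first k points), not what a real library does.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_distortion(points, k, iterations=20):
    """Run a basic Lloyd's K-Means and return the final distortion."""
    # Naive initialization: use the first k points as starting centroids
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: group each point with its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = [sum(axis) / len(members) for axis in zip(*members)]
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


# Made-up points forming two obvious groups
points = [(0, 0), (10, 0), (0, 2), (10, 2)]
for k in range(1, 5):
    print(k, kmeans_distortion(points, k))
```

In scikit-learn the same quantity is available as the inertia_ attribute of a fitted KMeans model, which also uses a smarter initialization than the one sketched here.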

4. Results

The resulting clusters derived from K-Means seemed to segment in such a manner that three of the neighborhood clusters were segmented by one of the independent variables, while one cluster was sort of a mediocre hodgepodge of neighborhoods. Only one neighborhood was in the cluster centered on the Category Index variable, and that was the Downtown neighborhood.

The distribution of the aggregated values vs. the category index (please note that the category 3 point for Downtown is obscured by its group’s corresponding centroid):

5. Discussion

Overview:

Based upon the data, criteria, and analytic results discussed above, the Downtown neighborhood is likely the best neighborhood in which to open an alternative music venue. While the area does have some crime, the rent is likely cheap and it is close to a wide variety of interesting venues, allowing some level of demographic cross-pollination.

Runners-up would be anything from cluster 1 (the Hodgepodge cluster) and cluster 2 (High Crime Index). Cluster 0 (High Property Value) would likely feature very high rent, and little access to other venues of interest:

Personal Observations:

The crime metric was probably the least dependable data available, mostly due to the collection process. I think a news/keyword analysis technique might have some application though, perhaps in providing a low-weight augmentation for a more dependable metric.

The category index approach could use refinement, I think, but overall it might be useful to have a way to quantify subjective values in decision-making. Building a list of similar venues, or ones that might be congruous or complementary to the venue being proposed, and then weighting those categories provides a useful way to ensure the neighborhood you are choosing is likely a good fit. While having a hotel or coffee shop nearby would be positive, it’s not always indicative of a good location for the type of demographic you may be catering to: a skate-park, gaming cafe, and brewery might be better indicators.

For the clustering results, I likely should have chosen 5 clusters, since that might have split Cluster 1 more effectively, making it less of a hodgepodge cluster while distinguishing a definitive runner-up cluster for a good location. From personal exploration of the data before clustering, I would have given Roosevelt/NDSU and West Acres votes for 2nd and 3rd choice respectively, since both feature good category values and likely have decent rent (Roosevelt/NDSU takes second place because it is directly adjacent to Downtown, not because of rent):

The relevant notebook can be viewed here: Visualize_FARGO_ND

Elements of Color

June 25, 2019 by Riston

Selecting Graceful Website Color Schemes

Image: Aida KHubaeva

If you already have an idea of what colors you’d like to use, or have already set color associations for your brand, then you are already a bit ahead of this section. If you have not quite determined your colors, then here is a nice infographic to help you get started:

Image by Author

Color is a principle that can be easily overdone, so it is best to simplify the color scheme of an application to one or two colors, not including varying shades of the selected colors or neutral colors such as white, black, and grey. It is very easy to cheapen the look and feel of a website by adding too many colors, with few exceptions. It is best to keep color themes simple and neutral, remembering that usually the goal is to provide a relatively transparent user interface with only minimal distractions.