John Daciuk's Portfolio

Me

John Daciuk is currently earning an M.S. in Computer Science from Columbia University and is expected to graduate in December 2020. Before joining Columbia, John studied Computer Science at the Harvard Extension School. John also holds a Master's in Teaching and founded a tutoring business through which he worked with over 800 students on the SAT, ACT, A.P. Calculus and Physics.

Projects in Computer Science

Entropy Encoders Implementation (2020)

Implemented language models and encoders for lossless compression of natural language text documents. Embedded within is a presentation and analysis of core ideas in adaptive entropy encoding, including Arithmetic and Huffman encoding. See the project here.
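As a rough illustration of one idea named above, here is a minimal sketch of static Huffman coding in Python. It is not the project's code: the project covers adaptive encoders, while this sketch builds a fixed code from symbol counts, and the function names `huffman_code`, `encode` and `decode` are made up for the example.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Build a prefix code (symbol -> bit string) from symbol counts in text."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (count, unique tie-breaker, partial code table).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        c1, _, codes1 = heapq.heappop(heap)
        c2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text, codes):
    """Concatenate the code word of every symbol."""
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    """Decode greedily; this works because no code word prefixes another."""
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)
```

Frequent symbols receive shorter code words, so text with a skewed symbol distribution takes fewer bits than a fixed-width encoding would.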

Cloud Computing Final Project (2020)

Worked in a team of three to create a social gaming app where players can meet, chat and play games. Leveraged AWS services to provide a rich and interactive experience for users. See my team's video demo of the features here.

Environmental Impact of Fossil Fuel Extraction Sites (2020)

Worked with another student to implement a database and frontend tool for exploring the impact of extraction sites on the natural environment; won the best project award in my databases course. The frontend takes advantage of the Google Maps API to allow users to visualize the entities in the DB and note correlations between them. Our DB contains over 700,000 extraction sites in addition to information about air quality for all US ZIP codes, species diversity in all US national parks and metal content in thousands of water bodies. Users can also interact with the site by leaving reviews for the various entities. See a demo here.

Synthesis of Recent Advances in Massively Parallel Computation (2020)

Led a team of three to write a technical paper threading together recent works devoted to Massively Parallel Computation (MPC) algorithms for geometric graph and dynamic programming (DP) problems. We give an overview of modern parallel algorithm design, with special attention to applications of MPC. We first highlight broad ideas by outlining how well-known problems can be efficiently solved with MPC. After presenting algorithms for sorting and maximal matching, we analyze the Minimum Spanning Tree problem in Euclidean space from [And+14], and synthesize the progress in [IMS17] for the Weighted Interval Selection problem. Read the paper here.

Kaggle Covid-19 Competition (2020)

Cleaned and augmented x-ray image data of various qualities and resolutions; trained a CNN in Keras to classify patients’ chest x-rays into 4 classes; achieved over 80% weighted accuracy with fewer than 1200 training samples. The success of the model rests on careful preprocessing, a weighted loss function aligned to the final metric, grid search to optimize hyperparameters and the power of CNNs to classify images. See here for detailed methodology and results.

Landscape Painting with GANs (2019)

Scraped the web for 20,000 paintings and used Google Cloud GPUs to train DCGANs; created thousands of beautiful and novel paintings in 128x128 resolution. Learn more about the data collection and training here; see generated paintings and analysis of trained GANs here.

Reinforcement Learning Project (2019)

Researched and implemented theory to train linear models from scratch and neural nets with TensorFlow for OpenAI Gym control environments. See the Jupyter notebook here.

COMS 4444 Projects (2019)

COMS 4444 (Programming and Problem Solving) is a course I took at Columbia that emphasizes collaboration on difficult, open-ended problems to be solved through programming. For the project 'Lunch' (see our paper), my team built an autonomous agent capable of coordinating strategy with the other teams' agents. For the project 'Mutation', my team used theory from causal inference to build a model capable of predicting underlying mutation rules from many observations of digital experiments (read our mathematical analysis). For the project 'Threeland', my team analyzed the robustness of different systems of voting and representation to gerrymandering. We tessellated a fictitious country of 300,000+ voters with Voronoi districts and simulated results using SciPy and Monte Carlo techniques (see a detailed exposition of our techniques). The project 'Flip', similar to 'Lunch', emphasized building an agent that could strategize dynamically. We held a class tournament to simulate hundreds of one-on-one matches between the different teams, and my team came in 2nd place out of 8 (see our report).

Pente for iOS (2018)

Developed a grid-based game with Swift and principled MVC design that adapts to any device size; leveraged Google Firebase to allow online play; used custom view classes and protocols. See a video demo.

Comparative Analysis of SGD Optimization Variants (2018)

Produced animations with NumPy and matplotlib to accentuate convergence differences between the AdaDelta, Adam and AdaGrad algorithms. Tested on popular analytic loss functions such as Bukin and Camel, as well as least squares regression and neural network models. See the Jupyter notebook.
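For readers unfamiliar with these optimizers, the sketch below shows the textbook AdaGrad and Adam update rules on a toy one-dimensional loss, f(x) = x^2. This is an assumption-laden illustration rather than the notebook's code: the loss, the starting point, the step counts and the hyperparameter values are all chosen here for the example, and it uses only the standard library instead of NumPy.

```python
import math


def grad(x):
    """Gradient of the toy loss f(x) = x**2 (chosen for illustration)."""
    return 2.0 * x


def adagrad(x0, lr=0.5, steps=200, eps=1e-8):
    """AdaGrad: scale each step by the root of the summed squared gradients."""
    x, sum_sq = x0, 0.0
    for _ in range(steps):
        g = grad(x)
        sum_sq += g * g
        x -= lr * g / (math.sqrt(sum_sq) + eps)
    return x


def adam(x0, lr=0.1, steps=200, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: bias-corrected moving averages of the gradient and its square."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x
```

Calling `adagrad(3.0)` and `adam(3.0)` moves both iterates toward the minimum at 0. Recording `x` at every step, rather than only returning the final value, would give the trajectories that an animation could plot.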

Playlist Recommendation with Spotify’s Million Playlist Dataset (2018)

Used Python, Pandas, scikit-learn and random forests to build a recommendation system able to generate playlists; evaluated with R-precision. See my team's work here.
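R-precision, the evaluation metric mentioned above, can be sketched in a few lines of Python. The sketch uses the standard definition (the fraction of the R relevant items that appear among the top R recommendations); the function name and the example track IDs are hypothetical, and the project's exact scoring may differ in detail.

```python
def r_precision(recommended, relevant):
    """Fraction of the R relevant items found in the top R recommendations.

    recommended: ranked list of item IDs, best first.
    relevant: collection of ground-truth item IDs (R is its distinct size).
    """
    relevant_set = set(relevant)
    r = len(relevant_set)
    if r == 0:
        return 0.0
    top_r = set(recommended[:r])
    return len(top_r & relevant_set) / r


# Hypothetical example: 3 held-out tracks, 2 of them in the top 3 recommendations.
score = r_precision(["t1", "t2", "t3", "t4"], ["t1", "t3", "t9"])
```

Because the cutoff equals the number of relevant items, a perfect ranking scores 1.0 regardless of how many tracks were held out.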