Installing JupyterLab on a Mac

Today I ran into a minor challenge getting JupyterLab installed on my Mac. I’m running macOS Catalina, so your mileage may vary. The following are a few items I ran into and the steps I followed to get JupyterLab set up and running. I’m sure I am way behind the curve of most InfoSec and IT types who do this kind of stuff in their sleep. But just in case there is someone else out there who does not know how to get started with JupyterLab on their Mac, I present the following:

  • The version of Python baked into macOS is old (I think it’s Python 2.7.x) and no longer supported by pip.
  • The error message I got when I ran

pip install jupyterlab

  • was the following:

DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020.

  • So I installed Python 3.8 – no problems there.
  • I then ran the following to make sure pip was installed in this instance of Python:

sudo -H python3 -m ensurepip

and the following was returned:

Looking in links: /tmp/tmp4m5ghja

Requirement already satisfied: setuptools in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (41.2.0)

Requirement already satisfied: pip in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (19.2.3)

  • So I knew that I had a functioning pip in my Python 3 environment – so far so good.
  • Last step was to use pip to do the install:

python3 -m pip install jupyterlab

  • I saw a long list of modules/apps downloaded and installed thanks to pip magic:

Collecting jupyterlab

Downloading (6.4MB)

     |████████████████████████████████| 6.4MB 1.0MB/s 

Collecting tornado!=6.0.0,!=6.0.1,!=6.0.2 (from jupyterlab)

  Downloading (482kB)

     |████████████████████████████████| 491kB 53.0MB/s 

(much more followed…)

  • After the install was completed I just ran the following to start up JupyterLab:

jupyter lab

  • My default web browser (Safari) popped up with the JupyterLab environment ready to go (http://localhost:8888/lab). Success!
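One thing that bit me along the way: a Mac can easily end up with more than one Python, and a bare `pip` may not belong to the one you think. A small sketch (my own, not from the steps above) showing why the `python3 -m pip` pattern is the safe one — it ties pip to a specific interpreter:

```python
import subprocess
import sys

# Which interpreter is running this script?
print(sys.executable)

# Invoking pip as a module of a specific interpreter (python3 -m pip)
# guarantees packages land in that interpreter's site-packages,
# rather than wherever an old system Python's bare `pip` might put them.
result = subprocess.run(
    [sys.executable, "-m", "pip", "--version"],
    capture_output=True,
    text=True,
)
print(result.stdout.strip())
```

Running this before installing anything confirms you are about to install into the Python 3.8 you just set up, not the old system copy.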

Cleaning Up My Drive

I recently transitioned to a newer PC for my main computer. It’s considerably faster (and lighter and cooler) than the 5+ year old Dell desktop it’s replacing. But it has a considerably smaller drive. I just did not want to pay a premium for a big SSD. So that means I need to be smart(er) about my storage usage. Sure I can keep everything “in the cloud” and I do. But I am old enough that I still like to keep a copy of my most precious data locally on my PC. I can then back up the local copy to any number of external drives and even back that up again to another cloud service like Backblaze. It’s an illness, I know.

As I started to set up my precious photo collection on the new PC, I noticed that it was consuming nearly 100 GB of my scarce 256 GB drive. No bueno – as we say in the InfoSec business.

The cure turned out to be sweeping duplicate files from my photo library. I won’t bore you with the details, but let’s just say I’ve been promiscuous in my use of photo apps and services – very promiscuous. Enough so that I know that I have duplicate copies of the same photos stored in various sub-directories on my drive. So I knew that I wanted to discover these dupes and deal with them. The question of course is how would I find them?

There are any number of applications you can download that claim to be the answer to your duplicate file woes. But I have to say that many of the ones I found were hosted on dodgy-looking websites, and I feared they would be crawling with spyware, adware and perhaps even worse bits. So I decided to use my Google-fu to look for any PowerShell scripts that might serve my needs.

And sure enough I found a great resource at a site called “Read Only Maio.” This person had already done all the heavy lifting for me. I just needed to apply a minor tweak here or there and create a workflow for myself. It was really very easy.

For each file location I wanted to review, I went through the following process. For example, I cleaned up my photos by opening the PowerShell console, changing directory to c:\Users\Kevin\OneDrive\Pictures\, and then running the following steps:

Stage one:

Get-ChildItem -File -Recurse |
  Group-Object Length |
  Where-Object { $_.Count -gt 1 } |
  Select-Object -ExpandProperty Group |
  ForEach-Object { Get-FileHash -LiteralPath $_.FullName } |
  Group-Object -Property Hash |
  Where-Object { $_.Count -gt 1 } |
  Select-Object -ExpandProperty Group |
  Select-Object Hash, Path |
  Out-File c:\dupe\duplicated_files.txt -Width 510

This outputs a text file to c:\dupe\ listing the detected duplicate files. (Create the c:\dupe folder first if it does not already exist, or Out-File will fail.) After reviewing and sanity checking the list I then moved on to Stage two.
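For anyone more at home in Python than PowerShell, the idea behind Stage one — group files by size first, then hash only the files whose sizes collide — can be sketched roughly like this. This is my own sketch, not from the original site; `find_duplicates` is a name I made up, and SHA-256 matches the default algorithm Get-FileHash uses:

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def find_duplicates(root):
    """Return {hash: [paths]} for files that appear more than once under root."""
    # Pass 1: bucket files by size. Files with a unique size
    # cannot be duplicates, so we never need to hash them.
    by_size = defaultdict(list)
    for p in Path(root).rglob("*"):
        if p.is_file():
            by_size[p.stat().st_size].append(p)

    # Pass 2: hash only the size collisions.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # unique size: skip the expensive hash
        for p in paths:
            # read_bytes() is fine for photos; stream in chunks for huge files
            digest = hashlib.sha256(p.read_bytes()).hexdigest()
            by_hash[digest].append(p)

    # Keep only hashes seen more than once: those are true duplicates.
    return {h: ps for h, ps in by_hash.items() if len(ps) > 1}
```

The size pre-filter is the same trick the PowerShell one-liner uses with Group-Object Length: hashing is the slow part, so you only pay for it where a duplicate is even possible.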

Stage two:

Get-ChildItem -File -Recurse |
  Group-Object Length |
  Where-Object { $_.Count -gt 1 } |
  Select-Object -ExpandProperty Group |
  ForEach-Object { Get-FileHash -LiteralPath $_.FullName } |
  Group-Object -Property Hash |
  Where-Object { $_.Count -gt 1 } |
  ForEach-Object { $_.Group | Select-Object -Skip 1 } |
  Select-Object -ExpandProperty Path |
  ForEach-Object { Move-Item -LiteralPath $_ -Destination C:\dupe }

Now for each detected duplicate, one file is moved to the c:\dupe directory. Note that if you have more than two of the same file, only one will be moved and you will see error messages in the PowerShell console advising you that a file cannot be created in the c:\dupe folder with the same name. This means that if you have more than two copies of the same file you will need to repeat Stage two multiple times.
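If rerunning Stage two for three-plus copies gets tedious, one way around the name collision is to rename files on their way into the dupe folder. A hypothetical Python sketch of that idea — `move_duplicate` and its arguments are my own invention, not part of the original workflow:

```python
import shutil
from pathlib import Path

def move_duplicate(path, dupe_dir):
    """Move one duplicate into dupe_dir, appending a counter to the
    filename on collision so every extra copy survives a single pass."""
    dupe_dir = Path(dupe_dir)
    dupe_dir.mkdir(parents=True, exist_ok=True)
    src = Path(path)
    target = dupe_dir / src.name
    counter = 1
    while target.exists():  # a same-named file is already parked here
        target = dupe_dir / f"{src.stem}_{counter}{src.suffix}"
        counter += 1
    shutil.move(str(src), str(target))
    return target
```

Called once per extra copy, this parks photo.jpg, photo_1.jpg, photo_2.jpg, and so on in the dupe folder, so even five copies of the same file get swept in one run.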

Stage three:

Review the files in c:\dupe and spot check if you want. If you are comfortable that these are indeed dupes, you can then empty c:\dupe and you will have freed up some space on your drive.

You can repeat Stages two and three as many times as it takes to eliminate your duplicate files.

Please note, this process does not take into account your preferred location for files. If you want to make sure that you keep the primary copy of the file in a certain location this process may not be right for you. But this worked a treat for me and eliminated thousands of duplicate files that were just wasting space on my drive. Hopefully this can do you some good as well.