HonestRepair


Thanks for subscribing to the HonestRepair Newsletter!

Wow, what a month. We've experienced a full range of exciting events, including server failure, the release of HRConvert2, and the roll-out of Quick Convert just to name a few.

What's New?

HRCloud2 (Github) - Since our last newsletter we've added a minor feature, fixed some bugs, dropped a dependency, and improved the installation process.

The feature we added is a "Maintenance Mode" setting in the config file. Users with older config.php files can simply copy/paste the " $EnableMaintenanceMode = '0'; " variable from the configLatest.php file after an auto-update. Changing the variable to '1' will enable maintenance mode and safely disable HRCloud2.

After running some tests it became apparent that FFmpeg is far more reliable for video conversion through HRCloud2 than HandBrake. Since FFmpeg is already a dependency we decided to use it for video conversions instead. The benefit is that the server no longer requires HandBrake to perform video conversions!

We also found some old code that was adding 2.5 seconds to every file download operation. Yikes! 

The Pell app also got some rework to the way its temp files are managed.

HRConvert2 (Github) - This project was released on June 9th and has already had a few updates and one issue opened.

The issue relates to handling files with spaces in the filename. We experienced this pain with HRCloud2 in the past so it's probably going to take a stroll through the commit history of that codebase to find a fix for HRConvert2. Still, getting this project up-and-running feels good.

Atoner (first-person shooter game) - Nothing to report from last time. 

HonestRepair Network - HonestRepairServer1 died this past month. In spectacular fashion, no less. Yeah, so that happened.

It started out with routine maintenance. What was supposed to be a straightforward cooler swap turned into a new motherboard and CPU! We shut the server down and replaced the cooler, but the server wouldn't start afterwards. We finally got the system to boot after installing a new Asus motherboard and AMD CPU.

When we finally got it to boot we realized that we had storage troubles as well! One of the drives that stores user supplied Cloud data had failed and needed to be replaced.

That was an expensive week for HonestRepair, but we're still here! Not only that, but we just opened up our Quick Convert online file converter to the masses. Based on HRConvert2, Quick Convert can convert 59x file formats and supports drag-and-drop file uploads. It's free, secure, and privacy-centered. Give it a shot and let us know on Github if you run into any issues.

Other Stuff - PSExec by Microsoft is a powerful thing. I've been experimenting with wrapping it with simple front-ends to make a desktop Executor app similar to the Executor app for HRCloud2 and it's actually pretty trivial. One could make their own PSExec Remote Admin panel with pre-scripted commands fairly easily. If you're into that sort of thing I highly suggest you read up on it.

In The News

Intel is in the news again for more side-channel vulnerabilities. This one is called "lazy FP state" (CVE-2018-3665) and it's far more complex than the sweeping Spectre and Meltdown flaws found earlier this year. According to Red Hat, some versions of their Enterprise Linux O/S aren't vulnerable, as the O/S doesn't use the vulnerable CPU features in some models. Still, this is yet another example of the intense cat-and-mouse game that is taking place within the info-sec community, and a stinging reminder that our insistence to date on high performance hardware over high security hardware has had unforeseen tradeoffs.

Source: https://techreport.com/news/33816/intel-lazy-fp-state-restore-vunerability-could-expose-privileged-data

Blender and YouTube are having a very public breakup, with YouTube cancelling the official Blender Channel and removing their entire library of 3D modelling videos. Blender is a cross-platform, open-source 3D modelling application. The cause of all the drama? Advertising revenue. YouTube has been insisting that Blender monetize their videos, a demand which Blender has refused. Instead of caving to the demands of big red, Blender announced that they would continue their channel on PeerTube, an open-source federated video service that outsources hosting to users of the service using the BitTorrent protocol.
Source: http://releases.ubuntu.com/18.04/

Microsoft & Flash Zero Days were uncovered and patched since our last issue. Microsoft actually had quite a few, one of which involves the way Javascript is handled and another having to do with a critical DNS flaw in Windows. The Flash zero days were numerous. But perhaps the best one...

Cortana, please hack my computer. That's right, Cortana can help an attacker with physical access gain access to a locked device. 

Automating Things

In the last issue we said we would create a script to bring a disorganized media library under control. Instead of going through 2TB of movies and TV shows and manually cleaning filenames, managing duplicates, and combining folders, we can create a script to reliably do the same thing! As a free bonus we get to practice our skills in Python. Sure, you could do this in PowerShell, VBS, bash, or Perl as well, but let's exercise our brains a bit.

Before we start it's a good idea to jot down ideas for our script. What kinds of inputs will it see? What outputs do we want to get out of it? How do we want to pass arguments to it? What are some of the steps our files must go through during their transformation from ugly to pristine? What edge cases could possibly interfere with those steps?

We know it's going to see filenames that contain a lot of repeating strings, like "BRRip" and "x264" and "[eng]". I made a list of all the things I can see that have to go just from scrolling through the mess. I'm going to need to evaluate a messy folder name full of messy filenames and try to create a clean folder full of clean files. Given my propensity for typos, I don't want to pass arguments to this thing from the command line. (One time I chown'd an entire /var directory and roached a 'nix box less than 2 hours after getting it configured). Edge cases surrounding the file extensions and duplicates could complicate things.

So now that we have an organized idea of what we want to accomplish, let's rough out some pseudocode! Roughing out a design with pseudocode makes prototyping a lot easier and more orderly. In the case of this particular project I even used many of my lines of pseudocode as code comments in the final product!

My pseudocode looks like this.....

Check to see if an input folder was supplied.

Verify that the target folder exists and is writable.

Create an array of subfolders of the parent folder. 

For each subfolder...

  Clean the folder name and create a new folder. 

  Replace dots with spaces.

  Replace ( and ) with [ and ].

  Replace commonly found torrent group names.

  For each file...  

    Copy each file into the new folder with proper names. 

    Include artwork and subtitles.

    Replace dots with spaces.

    Replace ( and ) with [ and ].

    Replace commonly found torrent group names.  

  Delete the improperly named file.

  Delete the improperly named folder.

That's a pretty good start. Looks like we need to program two loops that manipulate the hell out of a couple of strings, and then use the strings to create folders and move files around. Each time we do something risky with a file we'll put a check in the way to make sure things go smoothly. 
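The first few lines of the pseudocode (check the folder, make sure it's writable, collect the subfolders) can be sketched on their own. This is only an illustration: the helper name list_subfolders is made up here, and the finished script below takes a different route and walks the tree with os.walk instead.

```python
import os
import tempfile

def list_subfolders(parent):
    """Return the sorted names of the immediate subfolders of parent.

    list_subfolders is a hypothetical helper for this sketch only.
    """
    if not os.path.isdir(parent):
        raise ValueError("not a folder: " + parent)
    if not os.access(parent, os.W_OK):
        raise PermissionError("folder is not writable: " + parent)
    # os.scandir yields one entry per item; keep only the directories.
    return sorted(entry.name for entry in os.scandir(parent) if entry.is_dir())

# Try it on a throwaway folder so nothing real is touched.
with tempfile.TemporaryDirectory() as demo:
    os.mkdir(os.path.join(demo, "Movie.One.2017.x264"))
    os.mkdir(os.path.join(demo, "Movie Two (2018)"))
    open(os.path.join(demo, "notes.txt"), "w").close()
    print(list_subfolders(demo))
```

Running against a temporary folder like this is a cheap way to poke at each step before pointing anything at a real library.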

The tricky part here is removing the unwanted substrings from the folder name and file names. We want to remove a lot of things as quickly as possible. Assuming we need to remove 4 substrings per string and there are 4 strings per folder with 1,000 folders this works out to 16,000 iterations if we were to use a loop to scan for each removal. Even then, what if we want to remove 'abc' and 'bcd' from the string 'abcde'? Using a regular loop this would produce a string 'de' when we really expect 'e' as output. 

What we need is a function that we can call to do this work for us using regular expressions. Because we're using Python, we can use dicts (short for dictionary) to store key:value pairs for our replacements. This has the added benefit of being able to set specific replacements to use for specific substrings. In our case we're going to use this functionality to replace all periods from the filename with spaces while most other substrings get replaced with no space. 
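To see why a single regex pass behaves differently from looping over str.replace, here is a small contrived sketch. The function names and the swap dictionary are invented for the demonstration and are not part of the script. In the loop, a later replacement can re-match text that an earlier replacement just produced; the single pass only ever looks at the original string.

```python
import re

def replace_sequential(text, replacements):
    """Apply each replacement in turn; later ones see earlier output."""
    for old, new in replacements.items():
        text = text.replace(old, new)
    return text

def replace_single_pass(text, replacements):
    """Scan the original text once, looking up each match in the dict."""
    pattern = re.compile("|".join(map(re.escape, replacements)))
    return pattern.sub(lambda match: replacements[match.group(0)], text)

# A contrived example: swap the letters 'a' and 'b'.
swap = {"a": "b", "b": "a"}
print(replace_sequential("ab", swap))   # aa  (the new 'b' gets replaced again)
print(replace_single_pass("ab", swap))  # ba
```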

Luckily I found bgusach's Github Gist (located here) that does mostly what we want. We're going to hard-code the second argument and actually use two copies of it, one crafted to clean our folder names and the other tailored for cleaning our filenames.

Here's the folder cleaning function.....

# A function to prepare the folder name using our filters.

def cleanDir(string):

  # Define the dictionary of {'matches': 'replacements'}

  replacements = {'()': '', '[': '', ']': '', '{': '', '}': '', '.': ' ', '(': '', ')': '', 'BrRip': '', 'BRRip': '', 'XviD': '', 'BluRay':'', 'YIFY': '', '[YTS.AG]': '', '[YTS.PE]': '', 'HDTS': '', '720p': '', 'x264': '', 'AC3': '', '-': '', '1080p': '', ',': ''}

  # Place longer ones first to keep shorter substrings from matching where the longer ones should take place

  # For instance given the replacements {'ab': 'AB', 'abc': 'ABC'} against the string 'hey abc', it should produce

  # 'hey ABC' and not 'hey ABc'

  substrs = sorted(replacements, key=len, reverse=True)

  # Create a big OR regex that matches any of the substrings to replace

  regexp = re.compile('|'.join(map(re.escape, substrs)))

  # For each match, look up the new string in the replacements

 

  return regexp.sub(lambda match: replacements[match.group(0)], string)
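The comment about placing longer substrings first is easy to check with the same 'hey abc' example. The sketch below builds the pattern both ways; build_pattern and its longest_first flag are hypothetical names used only to make the comparison, assuming Python 3.7+ where dicts keep insertion order.

```python
import re

def build_pattern(replacements, longest_first=True):
    """Compile an OR pattern from the dict keys, optionally longest first."""
    keys = list(replacements)
    if longest_first:
        keys = sorted(keys, key=len, reverse=True)
    return re.compile("|".join(map(re.escape, keys)))

replacements = {"ab": "AB", "abc": "ABC"}
for longest_first in (False, True):
    pattern = build_pattern(replacements, longest_first)
    print(pattern.sub(lambda match: replacements[match.group(0)], "hey abc"))
# hey ABc
# hey ABC
```

Regex alternation takes the first alternative that matches, so without the sort 'ab' wins and the trailing 'c' is left behind.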

We need to have two functions because filenames have extensions which need to stay exempt from cleaning. Remember that we're replacing periods with whitespace, so to avoid losing the extension during processing we're going to chop it off before cleaning and add it back on afterwards. We implement this crudely by simply counting 4 characters from the end of the input string, which means we still corrupt .torrent files, but I don't care. If you do care it would be wise to add your own handler for that somehow. It could also be better if we used a second argument inside a single function to switch between folder cleaning and file cleaning, but I really only need to run this script once and my media library should be good to go. For what it is, doubling the programming time to improve program efficiency by 10% just doesn't make sense. We're trying to do this as easily as possible.

Here's the file cleaning function.....

# A function to prepare the filename using our filters.

def cleanFile(string):

  # Separate the last 4 characters (the file extension in the case of media, images, and subtitles which are all we need).

  stringExt = string[-4:]

  string = string[:-4]

  # Define the dictionary of {'matches': 'replacements'}

  replacements = {'()': '', '[': '', ']': '', '{': '', '}': '', '(': '', ')': '', 'BrRip': '', 'BRRip': '', 'XviD': '', 'BluRay':'', 'YIFY': '', '[YTS.AG]': '', '[YTS.PE]': '', 'HDTS': '', '720p': '', 'x264': '', 'AC3': '', '-': '', '1080p': '', '.': ' '}

  # Place longer ones first to keep shorter substrings from matching where the longer ones should take place

  # For instance given the replacements {'ab': 'AB', 'abc': 'ABC'} against the string 'hey abc', it should produce

  # 'hey ABC' and not 'hey ABc'.

  substrs = sorted(replacements, key=len, reverse=True)

  # Create a big OR regex that matches any of the substrings to replace.

  regexp = re.compile('|'.join(map(re.escape, substrs)))

  # For each match, look up the new string in the replacements and re-add the extension.

  return regexp.sub(lambda match: replacements[match.group(0)], string).rstrip(' ') + stringExt

# Verify that the target folder exists and is writable.

if os.access(inputDir, os.W_OK) is not True:

  print("inputDir not writable!")

  sys.exit()
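If the four-character slice in cleanFile bothers you (the .torrent case), one possible handler is os.path.splitext from the standard library, which splits at the last dot instead of counting characters. This is a sketch of an alternative, not what the script above does; clean_keep_extension and dots_to_spaces are made-up names, and dots_to_spaces is only a stand-in for the real filter.

```python
import os

def clean_keep_extension(filename, clean):
    """Clean everything before the last dot and re-attach the extension."""
    stem, ext = os.path.splitext(filename)
    return clean(stem).rstrip(" ") + ext

def dots_to_spaces(text):
    """Stand-in cleaner for the demo: only turns periods into spaces."""
    return text.replace(".", " ")

print(clean_keep_extension("Some.Movie.2017.torrent", dots_to_spaces))
print(clean_keep_extension("Some.Movie.2017.mkv", dots_to_spaces))
# Some Movie 2017.torrent
# Some Movie 2017.mkv
```

One caveat: a name with dots but no real extension (say "Some.Movie") would have ".Movie" treated as its extension, so this trades one edge case for another.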

And believe it or not that was the hard part! The easy part is creating an array full of subdirectories and then iterating through it, scrubbing all the strings. The remaining logic is fully commented, which pretty much explains removing the remaining whitespace, checking for duplicates, creating the folders, copying the files, and deleting the originals.....

# For each folder...

for dir in os.walk(inputDir):

  # Apply our filters to the input folder.

  oldDir = dir[0]

  newDir = cleanDir(oldDir)

  newDir = newDir.replace("   ", " ")

  newDir = newDir.replace("  ", " ")

  # Check if a folder already exists and create one if it does not.

  if os.path.exists(newDir) is not True:

    try:

      os.mkdir(newDir)

    except OSError:

      print("newDir not writable!") 

  # Scan the folder for files.

  oldFiles = os.listdir(oldDir)

  # For each file within a folder...

  for oldFile in oldFiles:

    oldFilePath = os.path.join(oldDir, oldFile)

    # Make sure the entry is a file and not a subfolder (note that isfile() follows symlinks).

    if (os.path.isfile(oldFilePath)):

      # Apply our filters to the input file.

      newFile = cleanFile(oldFile)

      print(newFile)

      newFilePath = newDir + '/' + newFile

      newFilePath = newFilePath.replace("   ", " ")

      newFilePath = newFilePath.replace("  ", " ")

      # Check that the target file doesn't already exist.

      if (os.path.isfile(newFilePath)):

        # Increment the filename if a file already exists with the same name.

        newFilePath = newDir + '/1_' + newFile

        os.rename(oldFilePath, newFilePath)

      else:

        # Move the file to the new directory.

        os.rename(oldFilePath, newFilePath)

  # After all files are processed try to delete the folder.

  # We could use shutil.rmtree instead but for this it's better to have errors with duplicates

  # that can be corrected instead of errors and deleted originals.

  if oldDir != newDir:

    try:

      os.rmdir(oldDir)

    except OSError:

 

      print("Cannot Delete oldDir!")
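The duplicate handling above only tries a single '1_' prefix, so a third file with the same cleaned name would collide again. If that matters for your library, a counter can keep going until it finds a free name. The sketch below is one way to do it; free_path is a hypothetical helper and is not part of the script.

```python
import os
import tempfile

def free_path(folder, filename):
    """Return a path inside folder that does not exist yet.

    Tries 'filename', then '1_filename', '2_filename', and so on.
    """
    candidate = os.path.join(folder, filename)
    counter = 1
    while os.path.exists(candidate):
        candidate = os.path.join(folder, "%d_%s" % (counter, filename))
        counter += 1
    return candidate

# Create the "same" file three times in a throwaway folder.
with tempfile.TemporaryDirectory() as demo:
    for _ in range(3):
        open(free_path(demo, "movie.mkv"), "w").close()
    print(sorted(os.listdir(demo)))
# ['1_movie.mkv', '2_movie.mkv', 'movie.mkv']
```

This still checks and then writes in two steps, so it is only safe when nothing else is writing to the folder at the same time.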

When we put it all together it looks something like this.....

# HR_Media_Organizer

import sys, os, re

inputDir = '/home/justin/Desktop/testDir'

 

# Change into the input folder (os.chdir() returns None, so there is nothing to store).

os.chdir(inputDir)

 

# A function to prepare the folder name using our filters.

def cleanDir(string):

  # Define the dictionary of {'matches': 'replacements'}

  replacements = {'()': '', '[': '', ']': '', '{': '', '}': '', '.': ' ', '(': '', ')': '', 'BrRip': '', 'BRRip': '', 'XviD': '', 'BluRay':'', 'YIFY': '', '[YTS.AG]': '', '[YTS.PE]': '', 'HDTS': '', '720p': '', 'x264': '', 'AC3': '', '-': '', '1080p': '', ',': ''}

  # Place longer ones first to keep shorter substrings from matching where the longer ones should take place

  # For instance given the replacements {'ab': 'AB', 'abc': 'ABC'} against the string 'hey abc', it should produce

  # 'hey ABC' and not 'hey ABc'

  substrs = sorted(replacements, key=len, reverse=True)

  # Create a big OR regex that matches any of the substrings to replace

  regexp = re.compile('|'.join(map(re.escape, substrs)))

  # For each match, look up the new string in the replacements

  return regexp.sub(lambda match: replacements[match.group(0)], string)

 

# A function to prepare the filename using our filters.

def cleanFile(string):

  # Separate the last 4 characters (the file extension in the case of media, images, and subtitles which are all we need).

  stringExt = string[-4:]

  string = string[:-4]

  # Define the dictionary of {'matches': 'replacements'}

  replacements = {'()': '', '[': '', ']': '', '{': '', '}': '', '(': '', ')': '', 'BrRip': '', 'BRRip': '', 'XviD': '', 'BluRay':'', 'YIFY': '', '[YTS.AG]': '', '[YTS.PE]': '', 'HDTS': '', '720p': '', 'x264': '', 'AC3': '', '-': '', '1080p': '', '.': ' '}

  # Place longer ones first to keep shorter substrings from matching where the longer ones should take place

  # For instance given the replacements {'ab': 'AB', 'abc': 'ABC'} against the string 'hey abc', it should produce

  # 'hey ABC' and not 'hey ABc'.

  substrs = sorted(replacements, key=len, reverse=True)

  # Create a big OR regex that matches any of the substrings to replace.

  regexp = re.compile('|'.join(map(re.escape, substrs)))

  # For each match, look up the new string in the replacements and re-add the extension.

  return regexp.sub(lambda match: replacements[match.group(0)], string).rstrip(' ') + stringExt

# Verify that the target folder exists and is writable.

if os.access(inputDir, os.W_OK) is not True:

  print("inputDir not writable!")

  sys.exit()

 

# For each folder...

for dir in os.walk(inputDir):

  # Apply our filters to the input folder.

  oldDir = dir[0]

  newDir = cleanDir(oldDir)

  newDir = newDir.replace("   ", " ")

  newDir = newDir.replace("  ", " ")

  # Check if a folder already exists and create one if it does not.

  if os.path.exists(newDir) is not True:

    try:

      os.mkdir(newDir)

    except OSError:

      print("newDir not writable!") 

  # Scan the folder for files.

  oldFiles = os.listdir(oldDir)

  # For each file within a folder...

  for oldFile in oldFiles:

    oldFilePath = os.path.join(oldDir, oldFile)

    # Make sure the entry is a file and not a subfolder (note that isfile() follows symlinks).

    if (os.path.isfile(oldFilePath)):

      # Apply our filters to the input file.

      newFile = cleanFile(oldFile)

      print(newFile)

      newFilePath = newDir + '/' + newFile

      newFilePath = newFilePath.replace("   ", " ")

      newFilePath = newFilePath.replace("  ", " ")

      # Check that the target file doesn't already exist.

      if (os.path.isfile(newFilePath)):

        # Increment the filename if a file already exists with the same name.

        newFilePath = newDir + '/1_' + newFile

        os.rename(oldFilePath, newFilePath)

      else:

        # Move the file to the new directory.

        os.rename(oldFilePath, newFilePath)

  # After all files are processed try to delete the folder.

  # We could use shutil.rmtree instead but for this it's better to have errors with duplicates

  # that can be corrected instead of errors and deleted originals.

  if oldDir != newDir:

    try:

      os.rmdir(oldDir)

    except OSError:

      print("Cannot Delete oldDir!")

And there you have it! This script will turn a folder named "Jigsaw (2017) [1080p] [YTS.AG]" into "Jigsaw 2017," which is exactly what I want. You might want something different so you certainly shouldn't attempt to run this script right out of the box without testing some samples. Obviously there is no warranty on this and if you mess up anything I take no responsibility. I just want to illustrate how approachable programming really is, and that programmatically solving problems is oftentimes a lot easier than doing things manually. It's simply a matter of establishing a repeatable system that you can describe, and then putting that description into your syntax of choice.
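One low-risk way to test some samples is a dry run that only reports what would be renamed and never touches the disk. The sketch below assumes you pass in your own cleaning function; plan_renames and toy_clean are invented names, and toy_clean is a deliberately tiny stand-in for the real filters.

```python
def plan_renames(names, clean):
    """Return (old, new) pairs for every name the cleaner would change."""
    plan = []
    for old in names:
        # Collapse runs of whitespace and trim the ends after cleaning.
        new = " ".join(clean(old).split())
        if new != old:
            plan.append((old, new))
    return plan

def toy_clean(name):
    """Stand-in cleaner for the demo; swap in your own filter function."""
    for junk in ("[1080p]", "[YTS.AG]", "(", ")"):
        name = name.replace(junk, "")
    return name

samples = ["Jigsaw (2017) [1080p] [YTS.AG]", "Already Clean 2016"]
for old, new in plan_renames(samples, toy_clean):
    print(repr(old), "->", repr(new))
# 'Jigsaw (2017) [1080p] [YTS.AG]' -> 'Jigsaw 2017'
```

Reading through a list like this before letting anything call os.rename is a lot cheaper than restoring a library from backup.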

In The Next Issue...

Learn how to un-Google your life by self-hosting as many things as possible!

Justin Grimes (@zelon88)

HonestRepair
Rowley MA, USA
Facebook
All work licensed under GPLv3.