Whenever I work on a digital project, I try to stay open to the fact that I won’t always be able to follow my initial plan through to the end. I’ve often had issues with my code since I’m primarily self-taught, and sometimes things just don’t turn out the way I imagined once I pair the technology with my content. Since I started using digital tools and methods in my scholarship, I’ve learned to accept these little failures as part of the process, but last month I had a new experience: the platform I was working with stopped running the way I needed it to.
At the beginning of the semester when I started using Voyant for topic modeling, I would get results that looked like this:
This may not have been the most visually appealing, but it got the job done. I could run the topics as many times as I needed for my project. However, with my revised work plan I ended up building my site’s infrastructure instead. When I returned to Voyant at the beginning of March, the topic modeling feature had been updated.
Now, Voyant’s topic modeling interface looks like this:
I would argue this is much more visually appealing, but unfortunately that is not enough. Voyant still derives an initial set of topics automatically when you upload a corpus, but you can no longer run additional topics because the button for doing so is inactive.
Bugs happen, especially when you change or update code. I should have been checking in on my topic modeling earlier. I’m sure the Voyant team will eventually get around to fixing this, but I’m at a point in my CHI project where I can’t move forward until I get through the topic modeling portion. So, I had to find a quick yet effective solution elsewhere. This is where the Topic Modeling Tool (TMT) came to my rescue.
The TMT is available for free on GitHub and is a graphical user interface (GUI) for MALLET. I was able to install it easily on my laptop, and the provided documentation as well as Miriam Posner’s tutorial were incredibly helpful in learning to use this tool.
Perhaps the biggest shift in moving from Voyant to the TMT was the formatting of my corpus. While I could upload a .doc to Voyant, I had to create .txt files for the TMT. Formatting was a bit difficult at first, but once I had cleaned up the text further and run some test topics, I understood what my corpus files needed to look like for the tool to work its best. There were also a few quirks I needed to learn to navigate, such as the .DS_Store file, which has to be deleted from the folder before I run the topic models. Ultimately, however, the TMT was exactly what I needed to get back on track.
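For anyone facing the same quirk, the folder cleanup can be scripted. This is only a minimal sketch in Python, assuming the corpus sits in one folder of plain-text files. The function name `prepare_corpus_folder` is hypothetical and is not part of the TMT or MALLET. It deletes any .DS_Store files and reports which files are .txt and which are not, so leftovers such as a .doc are easy to spot.

```python
from pathlib import Path


def prepare_corpus_folder(folder):
    """Remove .DS_Store files and sort the remaining files into .txt and other.

    Hypothetical helper for illustration; not part of the TMT itself.
    Returns (removed, txt_files, other_files) as lists of Path objects.
    """
    corpus_dir = Path(folder)

    # Collect matches first so nothing is deleted while the search is running.
    removed = list(corpus_dir.rglob(".DS_Store"))
    for stray in removed:
        stray.unlink()

    # Only the top level of the folder is checked for corpus files.
    files = sorted(p for p in corpus_dir.iterdir() if p.is_file())
    txt_files = [p for p in files if p.suffix.lower() == ".txt"]
    other_files = [p for p in files if p.suffix.lower() != ".txt"]
    return removed, txt_files, other_files
```

Calling `prepare_corpus_folder("my_corpus")` (the folder name is a placeholder) before each run should leave only the corpus files in place; anything listed in `other_files` still needs converting to .txt by hand.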
This experience has opened my eyes to something that I thought I already understood. Throughout my career in the digital humanities, sustainability has always been a big topic of discussion. I have always kept in mind that digital resources are dynamic: they can be updated, changed, deactivated, or left unmaintained at any time. This is exactly the scenario I thought I knew so much about, yet after working with Voyant for almost seven years I seem to have forgotten that it might someday stop working the way I expected it to. Even though at first the only thing I could think about was how inconvenient this was for my project, I’ve realized I learned an incredibly valuable lesson about something I thought I already knew.
This post was originally published at https://chi.anthropology.msu.edu/news-updates/