Week 10 (July 18 - July 22, 2022)
It’s hard to believe I’ve come to the end of my DREU! The past 10 weeks have really flown by.
This week, I presented my project to Professor Ordóñez-Román, Veronica, and Ziyan. I put together an abstract based on my presentation to submit to this year’s Grace Hopper Celebration.
This week I experimented with reimplementing my distinct regions algorithm to use Euclidean color distance as the measure of color similarity between regions. The code now also computes each region’s mean color from scratch, since we discovered in past weeks that the flood-filled region colors in the pymeanshift output can be a bit off. Unfortunately, the algorithm still performs poorly on black-and-white images. For example, below you can see that the black-and-white image requires a much stricter color similarity threshold for the algorithm to count the same number of distinct regions as it does for the RGB version of the image. In other words, it’s much easier for regions in black-and-white images to be counted as nondistinct. This is probably because a black-and-white pixel has only one value from 0 to 255 (duplicated into 3 channels before counting distinct regions), whereas an RGB pixel has three channels that vary independently. There is therefore much more room for color variation in a color image. In a black-and-white image like the zebra below, regions in the tree foliage and grass, for example, are much more easily considered similar in size and color than in the RGB version.
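To make the grayscale effect concrete, here is a minimal sketch in Python with NumPy. It is not my actual implementation: the function names, the greedy counting rule, the toy image, and the channel-average grayscale conversion are all assumptions made for illustration, and it ignores region size entirely. It computes each region’s mean color directly from the pixels, then counts a region as distinct only if its Euclidean color distance to every region already counted exceeds a threshold.

```python
import numpy as np

def mean_region_colors(image, labels):
    """Compute each region's mean color directly from its pixels.

    image:  float array of shape (H, W, 3)
    labels: int array of shape (H, W), one region id per pixel
    """
    return {
        int(lab): image[labels == lab].mean(axis=0)
        for lab in np.unique(labels)
    }

def count_distinct_regions(colors, threshold):
    """Hypothetical greedy rule: keep a region only if its Euclidean
    color distance to every region kept so far exceeds `threshold`."""
    kept = []
    for color in colors.values():
        if all(np.linalg.norm(color - k) > threshold for k in kept):
            kept.append(color)
    return len(kept)

def to_gray3(image):
    """Assumed grayscale conversion: average the channels, then
    duplicate the single value back into 3 channels."""
    gray = image.mean(axis=2, keepdims=True)
    return np.repeat(gray, 3, axis=2)

# Toy 2x3 image with three regions: pure red, pure green, pure blue.
image = np.array(
    [[[255, 0, 0], [0, 255, 0], [0, 0, 255]],
     [[255, 0, 0], [0, 255, 0], [0, 0, 255]]],
    dtype=float,
)
labels = np.array([[0, 1, 2],
                   [0, 1, 2]])

rgb_count = count_distinct_regions(
    mean_region_colors(image, labels), threshold=50)
gray_count = count_distinct_regions(
    mean_region_colors(to_gray3(image), labels), threshold=50)

print(rgb_count, gray_count)
```

In this toy case the three colors are far apart in RGB space but collapse to the same gray value once the channels are averaged, so the same threshold that separates all three regions in color merges them into one in grayscale. Real images are less extreme, but the same compression of color distances is at work.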
This week I finally flew to Rice and began in-person work. It has been nice exploring campus the past few days and meeting Professor Ordóñez-Román, Ziyan, and Veronica. Rice’s campus is very different from Columbia’s and very beautiful.
This week, I trained BERT to classify images as complex vs. noncomplex based on their captions.
I faced some personal and research-related challenges this week. For medical reasons, I had to postpone going to Rice in person until later this month. On the plus side, I made significant progress on my project in the past few days.
This week, I continued investigating automated measures of visual complexity. Two of the measures I used to rate the complexity of images from the SAVOIAS and MS-COCO datasets were feature congestion and number of regions (the latter calculated after segmenting images via mean-shifting).
This week I reviewed several possible datasets which we could use to analyze visual complexity: Conceptual Captions 3M and 12M, LAION-400M, SAVOIAS, MS-COCO, and SBU. I recorded how images and captions for each of these datasets were collected to determine which would be most appropriate for my project.
This week I solidified my understanding of BERT and the Transformer architecture. With the help of some tutorials from Professor Ordóñez-Román, I practiced fine-tuning models like ResNet and BERT for image and text classification on datasets like SUN and MM-IMDb. We spent some of our meetings this week reviewing these tutorials.
This week, I met with Professor Ordóñez-Román to discuss possible projects for the summer and set up regular meeting times for every week. Right now we’re meeting twice per week online (plus communicating via email), but I’m looking forward to coming to Rice in June.