<?xml version="1.0"?>
<News hasArchived="false" page="1" pageCount="1" pageSize="10" timestamp="Wed, 22 Apr 2026 00:56:43 -0400" url="https://beta.my.umbc.edu/groups/csee/posts.xml?tag=image">
<NewsItem contentIssues="true" id="141414" important="false" status="posted" url="https://beta.my.umbc.edu/groups/csee/posts/141414">
<Title>Talk: Rigorous measurement in text-to-image systems, 4/29</Title>
<Tagline>4-5pm ET Monday, April 29 in ENGR 231 &amp; Webex</Tagline>
<Body>
<![CDATA[
    <div class="html-content"><div><h4><strong>Rigorous measurement in text-to-image systems (and AI more broadly?)</strong></h4><div><br></div><div><a href="https://saxon.me/" rel="nofollow external" class="bo"><strong>Michael Saxon</strong></a></div><div><strong>University of California, Santa Barbara</strong></div><div><strong><br></strong></div><div><strong>April 29, 2024, 4:00 – 5:15 PM ET</strong></div><div><strong>ENGR 231 and <a href="https://umbc.webex.com/meet/gokhale" rel="nofollow external" class="bo">Webex</a></strong></div><div><br></div><div>As the large pretrained models underlying generative AI systems have grown larger, more inscrutable, and more widely deployed, interest in understanding their nature as emergent rather than engineered systems has grown. I believe that to move this "ersatz natural science" of AI forward, we need to focus on building rigorous observational tools for these systems, tools that can characterize capabilities unambiguously. At their best, benchmarks and metrics could meet this need, but at present they are often treated as mere leaderboards to chase and only very indirectly measure capabilities of interest. This talk covers three works on this topic. The first lays out the high-level case for building a subfield of "model metrology" that focuses on building better benchmarks and metrics. The other two address metrology in the generative image domain: one assesses multilingual conceptual knowledge in <a href="https://en.wikipedia.org/wiki/Text-to-image_model" rel="nofollow external" class="bo"><strong>text-to-image</strong></a> (T2I) systems, and the other is a meta-benchmark demonstrating that many T2I prompt faithfulness benchmarks fail to capture the compositionality characteristics of T2I systems that they purport to measure. 
This line of inquiry is intended to help move benchmarking toward the ideal of rigorous tools of scientific observation.</div><div><br></div><div><strong><a href="https://saxon.me/" rel="nofollow external" class="bo">Michael Saxon</a></strong> is a PhD candidate and NSF Fellow in the NLP Group at the University of California, Santa Barbara. His research sits at the intersection of generative model benchmarking, multimodality, and AI ethics. He’s particularly interested in making meaningful evaluations of hard-to-measure new capabilities in these artifacts. Michael earned his BS in Electrical Engineering and MS in Computer Engineering at Arizona State University in 2018 and 2020, respectively, advised by Visar Berisha and Sethuraman Panchanathan.</div></div></div>
]]>
</Body>
<Summary>Rigorous measurement in text-to-image systems (and AI more broadly?)     Michael Saxon  University of California, Santa Barbara     April 29, 2024 4:00 – 5:15 PM ET  ENGR 231 and Webex     As...</Summary>
<Website>https://www.tejasgokhale.com/seminar.html</Website>
<TrackingUrl>https://beta.my.umbc.edu/api/v0/pixel/news/141414/guest@my.umbc.edu/865d0e818fb35debfc44e3d3e0813c60/api/pixel</TrackingUrl>
<Tag>ai</Tag>
<Tag>image</Tag>
<Tag>llm</Tag>
<Tag>text</Tag>
<Tag>text-to-image</Tag>
<Tag>vision</Tag>
<Group token="csee">Computer Science and Electrical Engineering</Group>
<GroupUrl>https://beta.my.umbc.edu/groups/csee</GroupUrl>
<AvatarUrl>https://assets3-beta.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/xsmall.png?1314043393</AvatarUrl>
<AvatarUrl size="original">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/original.png?1314043393</AvatarUrl>
<AvatarUrl size="xxlarge">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/xxlarge.png?1314043393</AvatarUrl>
<AvatarUrl size="xlarge">https://assets4-beta.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/xlarge.png?1314043393</AvatarUrl>
<AvatarUrl size="large">https://assets3-beta.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/large.png?1314043393</AvatarUrl>
<AvatarUrl size="medium">https://assets1-beta.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/medium.png?1314043393</AvatarUrl>
<AvatarUrl size="small">https://assets2-beta.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/small.png?1314043393</AvatarUrl>
<AvatarUrl size="xsmall">https://assets3-beta.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/xsmall.png?1314043393</AvatarUrl>
<AvatarUrl size="xxsmall">https://assets3-beta.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/xxsmall.png?1314043393</AvatarUrl>
<Sponsor>Computer Science and Electrical Engineering</Sponsor>
<ThumbnailUrl size="xxlarge">https://assets4-beta.my.umbc.edu/system/shared/thumbnails/news/000/141/414/2e89f5ef5c07dcb98fe19a39c915f9ec/xxlarge.jpg?1714225177</ThumbnailUrl>
<ThumbnailUrl size="xlarge">https://assets3-beta.my.umbc.edu/system/shared/thumbnails/news/000/141/414/2e89f5ef5c07dcb98fe19a39c915f9ec/xlarge.jpg?1714225177</ThumbnailUrl>
<ThumbnailUrl size="large">https://assets1-beta.my.umbc.edu/system/shared/thumbnails/news/000/141/414/2e89f5ef5c07dcb98fe19a39c915f9ec/large.jpg?1714225177</ThumbnailUrl>
<ThumbnailUrl size="medium">https://assets4-beta.my.umbc.edu/system/shared/thumbnails/news/000/141/414/2e89f5ef5c07dcb98fe19a39c915f9ec/medium.jpg?1714225177</ThumbnailUrl>
<ThumbnailUrl size="small">https://assets1-beta.my.umbc.edu/system/shared/thumbnails/news/000/141/414/2e89f5ef5c07dcb98fe19a39c915f9ec/small.jpg?1714225177</ThumbnailUrl>
<ThumbnailUrl size="xsmall">https://assets1-beta.my.umbc.edu/system/shared/thumbnails/news/000/141/414/2e89f5ef5c07dcb98fe19a39c915f9ec/xsmall.jpg?1714225177</ThumbnailUrl>
<ThumbnailUrl size="xxsmall">https://assets1-beta.my.umbc.edu/system/shared/thumbnails/news/000/141/414/2e89f5ef5c07dcb98fe19a39c915f9ec/xxsmall.jpg?1714225177</ThumbnailUrl>
<PawCount>0</PawCount>
<CommentCount>0</CommentCount>
<CommentsAllowed>true</CommentsAllowed>
<PostedAt>Sat, 27 Apr 2024 09:43:46 -0400</PostedAt>
</NewsItem>

</News>
