CDC Forum: Finding a Needle in a Haystack: Enterprise-wide FOIA Searches - May 6, 2021
May 13, 2021 19:34 · 18572 words · 88 minute read
- [Event Producer - Michelle] Ladies and gentlemen, welcome.
00:04 - And thank you for joining today’s Finding a Needle in a Haystack: Enterprise-wide FOIA Searches at CDC webinar.
00:14 - Before we begin, please ensure that you have opened the WebEx participants and chat panel by using “Say something” the icon located at the bottom right-hand side of your screen.
00:27 - Please note all audio connections are currently muted and this conference is being recorded.
00:33 - You are welcome to submit written questions throughout the webinar, which will be addressed at the Q&A session of the webinar.
00:41 - To submit a written question, select All Panelists from the drop-down menu in the chat panel, then enter your question in the message box provided and send.
00:51 - If you require technical assistance, please send a chat to the event producer.
00:58 - With that, I will turn the webinar over to Alina Semo.
01:02 - Director of the Office of Government Information Services.
01:07 - Alina please go ahead. - Thank you, Michelle.
01:11 - Good morning, everyone. My name is Alina Semo, and as the Director of the Office of Government Information Services at the National Archives and Records Administration,
01:20 - it is my pleasure to welcome all of you to our event today,
01:24 - titled Finding a Needle in a Haystack: Enterprise-wide FOIA Searches at the CDC.
01:30 - I hope everyone who’s joining us today has been staying safe, healthy and well.
01:35 - Shortly I will go through some basic housekeeping rules and set some expectations for today’s meeting.
01:41 - First I would like to give you some background on today’s event and how OGIS became involved.
01:47 - As many of you know, OGIS is the federal FOIA ombudsman and in that role we work to improve the FOIA process in a number of different ways.
01:56 - By reviewing agency compliance, by offering dispute resolution services to assist requesters and agencies.
02:03 - By chairing and managing bodies like the FOIA Advisory Committee and co-chairing the Chief FOIA Officers Council and more.
02:11 - In that role OGIS has a unique perspective on FOIA programs across the federal government landscape.
02:19 - For the last 14 months, we have been watching with interest the impact of the pandemic on agencies’ FOIA programs.
02:26 - Just over a year ago, OGIS was pleased to host our first CDC live webinar, FOIA Requests for CDC COVID-19 Records.
02:36 - Once again this year, the CDC FOIA program managers sought our assistance to speak directly to all of you about how the CDC conducts enterprise-wide searches in response to FOIA requests.
02:49 - You will be hearing today from Srinath Tutukuri who is the IT Project Manager for the CDC FOIA program.
02:57 - Srinath is joined by the CDC FOIA Director, Roger Andoh, and the CDC Deputy FOIA Director, Bruno Viana.
03:05 - The PowerPoint for today’s presentation is accessible on the OGIS website at archives.gov/ogis.
03:15 - We will also add it to the chat. Throughout this morning we will be monitoring the chat function on WebEx.
03:23 - We are also simultaneously live streaming on the NARA YouTube channel and also monitoring the chat submitted on that platform.
03:32 - We will be taking questions throughout the presentation.
03:35 - So as you think of questions, please type them using the chat function on either platform.
03:41 - Our plan is to pause periodically, to check in and see if there are any questions that have come in via chat.
03:48 - And we will also open up our telephone lines on WebEx during those pauses to give attendees the opportunity to ask any questions orally.
03:57 - An important reminder with regard to your questions: please be aware that this is not the right time to ask questions about a specific FOIA request.
04:07 - We’re happy to have all points of view shared, but please respect your fellow attendees and keep the conversation civil and on topic.
04:15 - We will do our best to answer all of your chat and telephone questions.
04:20 - If we do not get to your question, please don’t worry we will post any unanswered questions and answers on the OGIS website in the upcoming days.
04:29 - We are recording today’s session and we will post a video and transcript of this event on the OGIS website as soon as it becomes available.
04:39 - I also want to take this opportunity to speak to those of you joining us from other federal agencies’ FOIA programs.
04:45 - The CDC FOIA program has been proactive in communicating with their stakeholders using this venue.
04:52 - OGIS is happy to help any other agency FOIA program to host similar events.
04:58 - If you are interested, please send us a chat during today’s event, call us at (202) 741-5470 or email us at ogis@nara.gov.
05:10 - We look forward to hearing from you. At this time, I would like to welcome our main presenter today, Srinath Tutukuri, who is also joined, as I mentioned earlier, by CDC FOIA Director Roger Andoh and CDC FOIA Deputy Director Bruno Viana.
05:28 - Srinath is the IT Project Manager in the CDC FOIA office.
05:33 - He primarily takes care of managing the enterprise searches in addition to also being responsible for FOIA’s IT infrastructure at the CDC.
05:43 - He has been in this role for more than six months during which he has explored various tools and options for improved enterprise searches.
05:51 - During this presentation, he will present firsthand information on the enterprise search process, the tools they use, potential issues and finally tips to scope search requests for optimal results.
06:06 - Srinath, over to you now. - Thank you Alina.
06:10 - Good morning, everyone. I’m Srinath Tutukuri, IT Project Manager here at the CDC FOIA office.
06:17 - Today, I’ll be doing a presentation on how we perform enterprise searches at the CDC FOIA office, in addition to the issues that we run into at the FOIA office when we try to run the searches.
06:30 - And finally some tips and recommendations that we feel can help us get better search results, and probably even take some advice and input from the user community and come up with better searches, which would help everyone in the long run.
06:44 - Having said that, I would like to go to the next slide which is the agenda of this meeting.
06:52 - Before I get started with the agenda of this meeting.
06:55 - Let me go over two important points that I need to mention.
06:58 - The first is, what are the capabilities that we have at the CDC FOIA office? Number one, we have access to search on all the email addresses within the CDC domain, so that comes to around five to 10,000 email boxes.
07:17 - In addition to this capability, we also have another capability where we can search for documents on all the shared drives within the CDC’s network.
07:28 - Having said that we do have some limitations on this.
07:31 - The first limitation is that we cannot run a wildcard search on any of the mailboxes.
07:38 - And we also definitely need to take some mandatory approvals from the custodians of these mailboxes in order for us to be able to perform any searches.
07:47 - And second, it is the same with any of these shared drives too.
07:52 - We need to take the approval and be granted access to the shared drive before we can search for any documents and locate them if there are any.
08:01 - Hopefully this gives you an understanding that we do have limitations when we run the search processes: we cannot just simply run a search on all the mailboxes at CDC, and we have to only run searches on restricted mailboxes and a group of mailboxes.
08:20 - We also cannot run searches on a whole division or a CIO if there are hundreds of people.
08:27 - Hopefully this gives a clear understanding before we delve deeper into the agenda of this meeting.
08:36 - So the agenda of this meeting is being divided into four categories.
08:40 - The first category is an overview of ES which is known as enterprise search.
08:45 - The second category is how we categorize these requests based on the technical complexity.
08:51 - And the third is the issues that we run into when we perform these enterprise searches.
08:57 - And the last is, what are the improvements that we would suggest based on the observations that we have made when we perform these enterprise searches.
09:06 - And finally, we’ll also have a Q&A session over this particular aspect.
09:12 - Before I move to the next slide. Does anyone have any questions? - [Michelle] Ladies and gentlemen, if you’d like to ask a question via phone, please press pound two on your telephone keypad to enter the question queue.
09:30 - Once again, pressing pound two will enter you into the question queue or you may enter your question into the chat box.
09:39 - - [Alina] So right now we have no questions from chat, so go ahead.
09:44 - - Thank you, Alina Let me go to the next slide.
09:51 - So the first category which talks about enterprise search overview is split into three categories.
09:56 - The first category talks about the process flow.
10:00 - The second category talks about the tools that we use.
10:03 - And the third category is one of the most important features that we use to really refine searches, which is known as Dedupe and Containment.
10:13 - Any questions on these three slides so far? - [Michelle] Once again, pressing pound two will enter you into the question queue. I don’t see any questions on the line or questions in the chat. - Sure, sure.
10:35 - Let me go to the next slide, the ES Process Flow.
10:41 - So this slide pretty much gives a bird’s-eye view or a complete understanding of how the enterprise search process is performed at the FOIA office here.
10:54 - So even before we get an enterprise search into the technical team, the enterprise search is pretty much analyzed and vetted by the FOIA analysts to make sure that relevant information is present.
11:08 - The most key information that we need to perform an enterprise search is, one, the custodian email mailboxes and the second is the time span.
11:17 - Without these two pieces of information, we cannot proceed with any enterprise searches.
11:24 - Just in case the requester has not provided us the custodian email boxes, our FOIA team contacts the relevant subject matter experts and gets us the relevant custodian details for us to perform the search.
11:39 - The same goes with the time span too if there is no time span, the subject matter experts provide us the time span information too.
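As a rough illustration of the rule just described (no search without custodian mailboxes and a time span), the check could be sketched as below. This is only a sketch under stated assumptions: the `SearchRequest` type, its field names and the `missing_information` helper are hypothetical and are not part of any tool mentioned in the talk.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class SearchRequest:
    """Hypothetical container for the inputs an enterprise search needs."""
    custodians: List[str] = field(default_factory=list)  # custodian mailboxes
    start: Optional[date] = None                          # start of time span
    end: Optional[date] = None                            # end of time span
    keywords: List[str] = field(default_factory=list)     # optional keywords


def missing_information(request: SearchRequest) -> List[str]:
    """Return the mandatory pieces that are still missing from the request."""
    missing = []
    if not request.custodians:
        missing.append("custodian mailboxes")
    if request.start is None or request.end is None:
        missing.append("time span")
    return missing


# A request with no custodians and no dates cannot proceed yet.
incomplete = SearchRequest(keywords=["example keyword"])
print(missing_information(incomplete))
```

An empty result from `missing_information` would mean the request can move on to the keyword analysis step; anything else would go back to the subject matter experts.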
11:46 - So once this information is provided to us, we analyze the search request to see if it needs any keywords.
11:55 - Sometimes the keywords are also provided by the requester.
11:59 - Sometimes if there are no keywords, they come in from other subject matter experts, or if there are no keywords given, one of my team members goes through the request, understands the search request and comes up with the keywords to perform a search.
12:16 - Any questions so far on this analyze-search aspect of enterprise search? - [Michelle] I do not see any questions on the line.
12:30 - And no questions in chat. - Thank you.
12:34 - Let me go to the next step in the enterprise search process.
12:37 - So once we have the keywords, the custodian mailboxes and the time span defined, we take this information and plug it into our primary search tool which is the Microsoft Office 365 Compliance.
12:51 - I will be going over the capabilities of this tool in the future slides.
12:57 - But for now, we simply take these details, enter this information as filters and put it into the Office 365 Compliance tool, which is a graphical user interface that pretty much hooks into the Microsoft Exchange Server and brings us all the emails from the Exchange server.
13:18 - Once this information is available to us we have some next steps that we follow.
13:25 - But before I go to the next step, does anybody have any questions related to this aspect on Office 365 so far? - [Alina] No questions in chat.
13:38 - - [Michelle] And no questions on the phone.
13:40 - - Sure, thank you. Generally, the next step that we follow is we try to eliminate clutter.
13:46 - However, we do not eliminate any clutter unless the requester has specifically stated that they are not interested in subscriptions and newsletters; otherwise we hold back and we simply perform the search.
14:01 - But if the requester has explicitly given us instructions that they’re not interested in subscriptions or newsletters, we try to eliminate those too.
14:14 - I do see one question regarding what is NUIX here.
14:18 - So NUIX is a forensic software which is used for analyzing the data, which I’ll be going over in the next section, specifically where I’ll be talking about different tools actually.
14:33 - So once we perform the search and we get the necessary search results, and if we have to eliminate any clutter, we go ahead and eliminate any subscription based emails based on the records that we see.
14:48 - And then we do re-run the search. After re-running the search, one of our analysts goes in and samples the data to make sure that the results meet the expectations and are within the scope of the request.
15:02 - So we do capture some metrics regarding what kind of records have been captured for each keyword, or how many records are coming from a single mailbox or different custodians and so forth.
15:17 - So we have all this information which is captured.
15:20 - And if the records are very few and we are certain that the search request is a simple request and there’s not much ambiguity in the results that we see,
15:34 - we go ahead with the next steps of exporting the data, preparing the data and finally presenting the data.
15:41 - However, if we see that the results look ambiguous and we have a lot of data, we present it to the analyst with all the necessary insights for them to make an informed decision, probably after contacting the requester.
15:55 - And if the scope really needs to be narrowed to come down to a smaller number of records,
16:01 - then we probably re-run the search to get some better results.
16:05 - However, if we feel that the records are good enough, we go ahead and export this data into either a PDF or a doc format, or a message or an email, and sometimes even a PST file.
16:19 - Occasionally, given that the graphical user interface has some limitations, we can’t really delve much deeper into each record to understand and see if the records are that good.
16:31 - We sometimes take this data and put it in Outlook to get a better insight and awareness of how the data looks.
16:37 - Once we feel that the data is good enough, we go ahead and prepare this data.
16:42 - So the preparation of the data is where we talk about dedupe and containment.
16:47 - In a specific section, we’ll talk about dedupe and containment in detail, but in a nutshell, what dedupe and containment does is eliminate lots of duplicate data, and it cuts the volume of the results by about 20 to 40% based on what we have seen so far.
17:04 - It helps us make the records more concise; that way it saves us time, and it saves the requester time, since we are not presenting them duplicate information.
17:16 - And the last step is that we do present this data and we take this data and put it into the shot five shapes and also into our case management tool, which is known as the FOIAXpress.
17:29 - And from here on, I pass on the data back to the analyst and the analyst takes it over from here.
17:35 - He brings this case to a logical conclusion and final closing of the case.
17:41 - And that is the overview of actually how the whole enterprise search process is performed from a technical perspective at the CDC FOIA office.
17:52 - So does anyone have any questions regarding the process flow here? - [Alina] No new questions in the chat so far.
18:04 - - I do see one. - [Michelle] I do see one.
18:07 - Yeah, Srinath go ahead. I was gonna say you see the same question I do right? - Yes, yes I do see one question here.
18:12 - “Have you ever had any email fail to properly import into FOIAXpress?” Yes, very rarely we do run into issues where some of the emails fail to load into FOIAXpress.
18:25 - In that instance what we do is that we take the email message out and we try to reformat it into PDF manually.
18:33 - But yes, we do run into occasions once in a while, but it is not very prevalent.
18:46 - Any additional questions? - [Participant] On that individual clarify there more specifically, they come in as their native format, rather than the proper format.
18:56 - Maybe you could talk a little bit about the format that they’re gonna come into.
19:02 - - Yeah, so usually most of the emails come in the proper format and so we don’t get any format which are not non-English specific or not-key specific.
19:13 - So we never really run into some of these issues.
19:16 - However, one issue we do occasionally see is that when emails are encrypted it prevents those emails from being converted into PDF documents.
19:28 - So what we do is that we have to sometimes take those encrypted emails and probably even figure out a way of going back to the requester to get that email for us, the encrypted email, and probably put it back into FOIAXpress
19:42 - in a different format and resolve the issue.
19:44 - So hopefully that answers the question. - [Michelle] All right and there were no questions on the phone.
19:56 - - Yeah, I do see another question, “Can you repeat the prepare data part?” Yeah sure, definitely.
20:01 - So what we do in the prepare data process is that once we have all the data exported from Office 365, which is usually either in individual messages
20:11 - or in a .pst file, which you can think of as a zip file
20:17 - with all the different messages, we take this data and put it into a software which we use, which is the case management software FOIAXpress.
20:24 - And we run this data through a process in that software which is known as dedupe and containment.
20:30 - When we do the dedupe and containment what really happens is that all duplicate messages are eliminated.
20:38 - So to give an example, let’s say I’m running a search on five custodian email boxes and I have sent an email to five people, with all five of them on CC.
20:50 - During the dedupe process what happens is that it only takes one unique email rather than the five emails; that way four records are eliminated from the set.
21:03 - And when I talked about containment, what it really means is that let’s say there’s a conversation between myself and another user, and we have 15 different emails going back and forth.
21:14 - What the containment process really does is that it eliminates all the individual emails and gets the last email of the email chain.
21:22 - So what really happens is that it saves us from having to go through each individual email and it eliminates the emails which are contained within the final email.
21:31 - So typically what we have observed, as I said in the past, is that the dedupe and containment process reduces the record volumes by 20 to 40% while making sure that the scope of the search result is still intact.
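The two ideas described above can be sketched in a few lines. This is a minimal illustration, not how FOIAXpress, NUIX or Office 365 implement the feature: the `Email` type, its fields, and the assumption that a reply quotes the earlier message text verbatim are all hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Email:
    message_id: str  # same ID for copies of one email found in several mailboxes
    thread_id: str   # identifies the conversation the email belongs to
    body: str        # text, assumed to quote earlier messages verbatim


def dedupe(emails: List[Email]) -> List[Email]:
    """Keep one copy of each email, no matter how many mailboxes held it."""
    seen = set()
    unique = []
    for email in emails:
        if email.message_id not in seen:
            seen.add(email.message_id)
            unique.append(email)
    return unique


def containment(emails: List[Email]) -> List[Email]:
    """Drop emails whose text is fully contained in a longer email of the same thread."""
    kept = []
    for email in emails:
        contained = any(
            other is not email
            and other.thread_id == email.thread_id
            and len(other.body) > len(email.body)
            and email.body in other.body
            for other in emails
        )
        if not contained:
            kept.append(email)
    return kept


# One email found in five custodian mailboxes collapses to a single record.
copies = [Email("m1", "t1", "Meeting at noon")] * 5
print(len(dedupe(copies)))

# A three-message chain collapses to the last email, which quotes the others.
chain = [
    Email("a", "t2", "A"),
    Email("b", "t2", "B\n> A"),
    Email("c", "t2", "C\n> B\n> A"),
]
print([e.message_id for e in containment(chain)])
```

Real email threads diverge and reformat quoted text, so a production tool needs far more careful matching than the substring test used here.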
21:49 - Hopefully this answers your question Adrian.
21:59 - Yes for dedupe and containment we can use different software.
22:03 - We use Office 365; for some of the dedupe process NUIX can do dedupe and Outlook can also do it, so sometimes we are capable of eliminating some records there, but we primarily use FOIAXpress for containment, actually.
22:20 - So to answer your question, yes, we definitely run all the records through the FOIAXpress software for the containment and it’s extensively used.
22:29 - Almost every search goes through the containment.
22:33 - The exception would be when we have only five or 10 records; then there’s no necessity to really do the containment process.
22:43 - Let me go to the next slide. And I did go over some of the tools that we spoke about during the previous slide.
22:57 - But I can go over each of the different tools that we use here at the FOIA office to make sure that we are able to get the best results out.
23:07 - So any search that is performed at the FOIA office first goes to the Microsoft 365 Compliance tool.
23:17 - So the way it runs is that, as I said, this tool is a graphical user interface, which is a web based interface, which has filters to perform searches.
23:28 - The different filter options that we have are these: it gives an ability to search on keywords, then the subject of the email, the recipients of the email, the participants of an email as well as who the sender is, and the most important aspect is running the search on the custodians of the email.
23:48 - And finally a date range. Without the custodians and date range we do not run the search because it’s going to be a wild goose chase,
23:55 - and maybe the search can take forever, and we’re not going to get any productive results, actually.
24:02 - As I said, the tool is very simple and it gives us insight.
24:06 - It’s a first step for us to really get all the data.
24:09 - And based on our observation, if the scope is very well defined and the record count is low, our records are much more precise and we feel confident at this level, so we just go ahead and skip running these records in any other tool.
24:28 - However, if we do get lots of data and we feel that the search does not look that great or the results can be ambiguous at times,
24:40 - and if the record count is less than, say, a hundred or 50 records,
24:44 - we simply take the records, all this data in a PST file, and quickly analyze it in Outlook.
24:49 - That is a very quick way of looking at the records for us to check that the data looks good and probably eliminate any records which are not needed.
24:58 - And we seldom go to the Outlook process, but if required, we do it, if the record volumes are low.
25:08 - And the next step is that we do not really go into FOIAXpress to have to do any searches.
25:12 - What we do is we have been using this forensic software called NUIX, and this software has more capabilities than the Office 365 software, as well as Outlook.
25:23 - And it can really provide much more insights into the record.
25:29 - And it is capable of doing some containment, and additional dedupe that the Office 365 process misses when the record volumes are much higher.
25:38 - And it gives us an insight into the data and helps us understand the record counts.
25:43 - So what we see is that once we have a big set of around 10,000 records and we run it through NUIX, it cuts down the data volume and is much more precise, and it offers a lot of options for insights: it groups the data based on subsets, groupings, dates and topics, and it also shows us the different domains and how many emails are coming in from each domain.
26:11 - It also gives us a better ability to cut down. Let’s say the user feels, “I’m only interested in all the emails sent from CDC.”
26:22 - It helps us to narrow down the record. So it has different additional search filter options, which are not available in the Office 365.
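The per-domain insight described here (how many emails come from each domain, and narrowing to one sender domain) can be illustrated with a short sketch. It does not use or imitate the NUIX interface; the helper names and the sample addresses are hypothetical.

```python
from collections import Counter
from typing import List


def sender_domain(address: str) -> str:
    """Return the lowercased domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower()


def count_by_domain(senders: List[str]) -> Counter:
    """Count how many emails come from each sender domain."""
    return Counter(sender_domain(s) for s in senders)


def only_from_domain(senders: List[str], domain: str) -> List[str]:
    """Keep only the senders whose address is in the given domain."""
    return [s for s in senders if sender_domain(s) == domain.lower()]


# Hypothetical sender addresses taken from a result set.
senders = [
    "analyst@cdc.gov",
    "director@cdc.gov",
    "newsletter@example.org",
    "reporter@example.com",
]
print(count_by_domain(senders).most_common())
print(only_from_domain(senders, "cdc.gov"))
```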
26:29 - And we do use this tool on an as needed basis.
26:33 - But it is definitely a powerful forensic software.
26:41 - We can pretty much run and analyze a lot of data, not just Outlook emails, even a lot of hard drive based data and a lot of documents and so forth.
26:52 - Any questions on this so far on the tool specifically? - [Michelle] I do not see any question in the phone queue.
27:05 - Reminder, ladies and gentlemen, if you would like to enter into the phone queue, pressing pound two on your telephone keypad will enter you into that queue. - [Alina] I see no new question from the chat.
27:17 - - Thank you, I can move to the next slide. Let me talk about dedupe and containment.
27:30 - I did speak about it a few minutes back, but I can present a slide which I have captured from the FOIAXpress tool to give some insight into how the dedupe went, actually.
27:43 - So in this instance, we had around 900 records, which were captured after running the search first through Office 365, as well as the NUIX software.
27:54 - We knew that the 903 records could be condensed further, and as I said, the dedupe process typically brings down the records by around 30%.
28:04 - So when we ran these 903 records through the dedupe process in FOIAXpress, our interest was primarily in the light green bar here.
28:18 - We want to see how many records it comes down to.
28:20 - So from 903 records, the records were condensed to 630 records here.
28:26 - And the next thing is that it also gives us the number of records which are eliminated as part of the containment process.
28:33 - So in this instance, 270 records were eliminated by the containment process.
28:38 - It was not able to eliminate any duplicates; the reason is that all the duplicates had already been eliminated by Office 365 as well as NUIX.
28:46 - So we were able to condense the record volume from 903 to 630, which translates to a reduction of around 30 to 35% of records.
28:57 - Just to give you an overview, each record on average translates to around four pages.
29:02 - So to put it in perspective, 600 records roughly comes to around 2,500 pages of data, which needs to be analyzed by the analysts and presented to the final requester.
29:15 - So the containment process helps us greatly, actually, in reducing this volume of records while keeping the scope intact, and saves time for the analyst as well as the requester.
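The arithmetic behind this example can be checked with the figures quoted in the talk (903 records before, 630 after, about four pages per record). The variable names are illustrative only. With these exact figures the difference is 273 records, which the speaker rounds to 270, and the reduction works out to roughly 30%.

```python
records_before = 903      # records entering the FOIAXpress containment step
records_after = 630       # records remaining afterwards
pages_per_record = 4      # approximate average quoted in the talk

eliminated = records_before - records_after
reduction_pct = 100 * eliminated / records_before
pages_to_review = records_after * pages_per_record
pages_saved = eliminated * pages_per_record

print(f"eliminated: {eliminated} records ({reduction_pct:.1f}%)")
print(f"pages left to review: about {pages_to_review}")
print(f"pages no longer reviewed: about {pages_saved}")
```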
29:32 - Any questions on this on dedupe and containment? - [Event Producer - Michelle] There are no questions on the phone line.
29:44 - - [Alina] No chat questions either, thank you.
29:47 - Thank you, I can move to the next slide. Yeah, so that pretty much concludes the first section of the overview or the bird’s-eye view of the enterprise search process at the
30:01 - CDC. The different aspects that I did cover were the process flow, then the different tools that we use and also the dedupe process.
30:10 - Let me move to the next section, which is how the technical team categorizes the enterprise search process.
30:17 - I would like to make it clear that we also have another categorization for the administrative aspects of enterprise searches, but I will not be going into that aspect.
30:26 - Here the categorization is primarily limited to the complexity that is involved from a technical aspect when we try to get search results.
30:35 - So I’ve categorized these searches into three different categories.
30:39 - One is the low intensity, the next is the moderate intensity and the last is the high intensity search.
30:46 - I would like to go to the next slide, where we are talking about low intensity search.
30:51 - So when I say low intensity search, what it really means is that the searches are very simple to perform and we’re absolutely certain that we are getting the right results.
31:05 - And we can very quickly get the search done and close the request in a timely fashion without any issues or without having to go back and forth with the requester.
31:16 - So, as the picture on the side here says, “keep it simple”.
31:20 - So generally when requesters try to keep it simple we know that the search is a very low intensity search.
31:27 - So what is really a low intensity search? I have placed a few attributes which define what a low intensity search is.
31:35 - So we have the custodian mailboxes which are defined.
31:39 - So when I say the custodian mailboxes are defined, we know from the request that we have to run a search on a few custodian mailboxes, which could be the Director or an Assistant Director or probably the head of a division and things like that.
31:54 - So it’s very clear on who we’re running the search.
31:56 - The next is that we also have a very short time span.
32:00 - The time span is a very important thing, actually, because the shorter the time span, the more accurate our results are, and the more in line and in sync with the scope of the request.
32:11 - So if it is a one-to-two-week search, or a search a few days around an event, the search results are very precise.
32:18 - And the next is the number of participants.
32:20 - So if we know who the participants are, let’s say we have very limited participants, like a discussion between a few individuals, four individuals, five individuals,
32:29 - then it really helps us to narrow down the searches, and that makes the search results more accurate, actually.
32:36 - And the last is not having any ambiguous keywords.
32:39 - When I say ambiguous keyword, we don’t expect a keyword like “Run a search on COVID” or “Run a search on autism” or “Run a search on AIDS” or such.
32:50 - The searches could be very big and we could get tons of records, actually.
32:54 - That’s what I mean by ambiguous keyword. Any questions on this so far? - [Alina] We have one question on the chat.
33:03 - “Do you use NUIX as your primary method to dedupe your records, and/or do you think this is a more efficient way to dedupe compared to the ADR/EDR tools within FOIAXpress?” - We do not use NUIX to primarily dedupe the records.
33:22 - So we do use NUIX for deduping but the first step of dedupe always happens at Office 365.
33:29 - And if anything is missed during the dedupe process of Office 365, it is captured in NUIX.
33:35 - And by and large, NUIX does a good job with dedupe.
33:38 - So we do not see any dedupe happening when we do the containment and dedupe process in FOIAXpress.
33:48 - But FOIAXpress does an excellent job with the containment process.
33:53 - And probably NUIX also has an ability to do the containment process, but we haven’t figured that out.
33:59 - As I said, we only started using it in the last three to four months.
34:07 - Does that answer your question? Okay, sure.
34:13 - If there are no questions related to the low intensity search.
34:16 - I can show an example of what a low intensity search is so that it gives an understanding of what I really mean by low intensity search.
34:25 - The next slide please. Here is an example of a low intensity search, and I’ll give you about 15 seconds to go ahead and read the content of the low intensity search.
34:58 - Yeah, all right. So the requester here has requested only email communication between the CDC Director and the office of the Vice President, Mike Pence, between September 10th and October 1st.
35:13 - So the time span here is very short, about 20 days.
35:15 - We know who the custodian is here: it is the Director of the CDC.
35:20 - And we also know who the participants are here.
35:23 - So there are two parties: one is the Director of CDC and the other could be anybody from the office of the Vice President.
35:31 - It could be a secretary or anybody sending those emails to us.
35:34 - So we have our mailbox defined, we have our dates defined and there is no necessity to do a keyword search here.
35:41 - And all we do is that we make sure that the participants are anyone from the email domain of the Vice President.
35:50 - So in this instance, all the participants could have a domain address of ovp.eop.gov.
35:58 - So that would be a partial content or suffix within the email address of any emails coming in from the office of the Vice President.
36:12 - So that pretty much will give us a very concise result, accurate results.
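The low intensity example above boils down to three filters: one custodian mailbox, a date range, and a participant whose address falls under a given domain. A minimal sketch of that filter logic follows. It is not the Office 365 Compliance query syntax; the `Message` type, the sample addresses and the year 2020 are assumptions made purely for illustration, since the talk does not state a year.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

# Domain for the Office of the Vice President as referred to in the talk;
# passed as a parameter below so it can be swapped for any other domain.
OVP_DOMAIN = "ovp.eop.gov"


@dataclass
class Message:
    sent: date               # date the email was sent
    participants: List[str]  # sender plus all recipients


def from_domain(address: str, domain: str) -> bool:
    """True if the address belongs to the domain or one of its subdomains."""
    host = address.rsplit("@", 1)[-1].lower()
    return host == domain or host.endswith("." + domain)


def in_scope(msg: Message, start: date, end: date, domain: str = OVP_DOMAIN) -> bool:
    """Date range check plus at least one participant from the domain."""
    in_range = start <= msg.sent <= end
    return in_range and any(from_domain(p, domain) for p in msg.participants)


# Hypothetical contents of the single custodian mailbox (assumed year: 2020).
mailbox = [
    Message(date(2020, 9, 15), ["director@cdc.gov", "staff@ovp.eop.gov"]),
    Message(date(2020, 9, 20), ["director@cdc.gov", "someone@example.org"]),
    Message(date(2020, 11, 2), ["director@cdc.gov", "staff@ovp.eop.gov"]),
]
hits = [m for m in mailbox if in_scope(m, date(2020, 9, 10), date(2020, 10, 1))]
print(len(hits))
```

No keyword filter appears in the sketch, which mirrors the point made in the talk: with the mailbox, dates and participant domain defined, a keyword search is not needed.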
36:17 - And from here we can just take those results, and if the record count is very small, we run them through the dedupe and containment process within FOIAXpress and we are able to get the results in a very short time span.
36:36 - And the analysts will be able to close off the request.
36:39 - So what I mean to say is that it’s a very simple search for us because the scope is very clear and there’s no ambiguity, and it makes it very easy on us to get searches done.
36:48 - So if our requester community can provide us
36:52 - requests which are very specific and very low intensity, it helps us in the long run.
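The low intensity search described above comes down to three filters: one custodian mailbox, a short date window, and a participant domain suffix. A minimal Python sketch follows, assuming a simplified message record. The `Email` class, the `director@cdc.example` address, and the year 2020 are illustrative assumptions; none of this is the actual Office 365 or FOIAXpress interface.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Email:
    """Simplified stand-in for a mailbox item (hypothetical structure)."""
    sender: str
    recipients: list
    sent: date


def is_responsive(msg, custodian, domain, start, end):
    """Keep a message only if it falls inside the date window and is
    exchanged between the custodian and anyone at the given domain."""
    if not (start <= msg.sent <= end):
        return False
    parties = [p.lower() for p in [msg.sender, *msg.recipients]]
    has_custodian = custodian.lower() in parties
    has_domain = any(p.endswith("@" + domain.lower()) for p in parties)
    return has_custodian and has_domain


# Hypothetical sample data; the addresses and the year are assumptions.
CUSTODIAN = "director@cdc.example"
DOMAIN = "ovp.eop.gov"
START, END = date(2020, 9, 10), date(2020, 10, 1)

mailbox = [
    Email("staffer@ovp.eop.gov", [CUSTODIAN], date(2020, 9, 15)),
    Email(CUSTODIAN, ["someone@ovp.eop.gov"], date(2020, 10, 5)),  # outside the window
    Email("press@news.example", [CUSTODIAN], date(2020, 9, 20)),   # wrong domain
]

hits = [m for m in mailbox if is_responsive(m, CUSTODIAN, DOMAIN, START, END)]
print(len(hits))
```

No keyword filter is needed here, which is what makes the search low intensity: mailbox, dates, and domain already pin down the result set.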
36:58 - Any questions on this example? - [Event Producer - Michelle] I do not see any questions in the phone queue.
37:09 - - [Alina] Nothing on the chat, thank you. - Thank you, I can move to the next slide.
37:17 - So the next slide is about the moderate intensity search.
37:21 - So when I say a moderate intensity search, what I really mean is that the custodian mailboxes may or may not be defined, but we have a way of figuring out who the custodians are.
37:32 - And the participants may be known sometimes, or they may not be known, but typically in a moderate intensity search we could have a larger number of participants too.
37:42 - And we may have to run searches on group mailboxes, like busy inboxes or response mailboxes and so forth.
37:51 - And the search is not specific to a particular keyword.
37:55 - It could be a phrase search, or it could be a combination of keywords that need to be searched on.
38:00 - And generally the date range is larger; it could be a much longer time span.
38:07 - So when I say a moderate intensity search, what it really means is that the record count is much higher, typically in the hundreds to thousands of records.
38:16 - We do know that we can definitely get the records here, but it involves some work on our end before we can really pinpoint the accurate results, which are relevant to the scope of the request.
38:33 - Any questions on this moderate intensity search? - [Event Producer - Michelle] There are no questions on the line.
38:42 - - [Alina] Nope. - Thank you. I will put an example of a moderate intensity search.
38:48 - Probably I’ll take a few more minutes to really explain that example in much more detail.
38:54 - Next slide please. Thank you. Yeah, I’ll give 15 seconds for you to go ahead and read the content of the search. All right, let me get started with this request.
39:45 - The request comes from a news reporter who was interested in the investigation that CDC performed related to an incident where a few people were infected with COVID when traveling on a bus from Milwaukee all the way to Texas, and, unfortunately, an individual passed away too.
40:13 - So the event details, if you look at the email, say it was around October 13, 2020, when the event happened.
40:21 - So in this instance, we do not have any custodian mailboxes to search on, and we do not really have a timeframe for the search.
40:30 - However, the only thing that we have from the request is, we are able to pick up the different keywords.
40:36 - One is COVID-19, then there is a commercial bus, a reference to a particular company, El Tornado, a place, which is Seneca Foods, and some of the different stops like Laredo, Chicago, Wisconsin and so forth.
40:55 - So we do have sufficient keywords to probably even start off with a search here.
41:01 - So what happens here is that our analyst goes to the…
41:06 - The analyst contacts the relevant subject matter experts, and here was able to identify the people who really performed this investigation.
41:15 - So we did get the custodians from them, and they also provided us the timeframe for this search.
41:22 - So we had the custodians now, and we also had the timeframe.
41:28 - Any questions so far on this? - [Michelle] There are no questions on the line.
41:36 - - [Alina] No questions, yeah thank you. - Sure thank you.
41:38 - Please go to the next slide. Yeah, so we did take the keywords, the custodian mailboxes and the date range.
41:53 - And we did perform a search with the Office 365 tool on the Exchange server.
42:00 - So the way we did the search was that we had to come up with a concatenated phrase, a set of keywords combined with the Boolean operators AND and OR, to draw some results and do some analysis on them.
42:17 - So what we really did was we said, let’s do a search based on bus AND any of these keywords, Milwaukee or the different places here, Dallas and so forth, OR motor coach AND any of the different places here, within this particular date range.
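The Boolean combination just described (a vehicle term AND any of the place names) can be sketched in Python as below. This is only an illustration: the exact query syntax accepted by the Office 365 search tool is not given in the talk, so `build_query` produces a generic Boolean string and `matches` evaluates the same logic locally on plain text. The function names and the sample subjects are hypothetical.

```python
import re


def or_group(terms):
    """Join terms with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(vehicle_terms, place_terms):
    """(vehicle OR ...) AND (place OR ...) as a generic Boolean string."""
    return or_group(vehicle_terms) + " AND " + or_group(place_terms)


def contains_any(text, terms):
    """True if any term occurs in the text as a whole word or phrase."""
    return any(
        re.search(r"\b" + re.escape(t) + r"\b", text, re.IGNORECASE)
        for t in terms
    )


def matches(text, vehicle_terms, place_terms):
    """Local evaluation of the same AND/OR logic."""
    return contains_any(text, vehicle_terms) and contains_any(text, place_terms)


VEHICLES = ["bus", "motor coach"]
PLACES = ["Milwaukee", "Laredo", "Chicago", "Dallas"]

query = build_query(VEHICLES, PLACES)
print(query)
print(matches("COVID-19 bus investigation: Milwaukee to Laredo", VEHICLES, PLACES))
```

Whole-word matching is a deliberate choice here: a plain substring test would let "bus" hit "business", which is one way generic keywords inflate the record count.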
42:38 - So based on the search that we ran, we were able to get, I don’t remember the exact figure, but it is a few hundred records.
42:47 - And I believe we had around three or four custodians whose mailboxes we had to search.
42:54 - So we came up with around four hundred records, and we were quickly able to ascertain that yes, these records look in sync with what the requester is looking for.
43:08 - What we usually do is some sampling of a few records here and there to see if the records look relevant and the keywords look relevant.
43:15 - So in this instance, we were able to identify that within the subjects of each email, it says COVID-19 bus contact, online, convenience, bus investigation.
43:25 - The things which are highlighted in yellow show the particular keywords we captured.
43:30 - And we also did see some emails which were newsletters, news reports and things like that,
43:42 - and news articles, which were not really relevant to the investigation.
43:47 - So we had to eliminate those; we call them clutter, because these are noise emails which are not really relevant to the investigation.
43:57 - So we had to eliminate newsletters and user subscription emails, and things like that.
44:02 - And we were able to cut down some of those results.
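The clutter removal step described above can be sketched as a simple subject-line filter. This is a minimal illustration under stated assumptions: the marker list and the sample subjects are hypothetical, and the talk does not say how the team actually identified newsletters and subscription emails.

```python
# Hypothetical markers for newsletter-style "clutter"; a real list would be
# built from sampling the actual search results.
CLUTTER_MARKERS = ("newsletter", "subscription", "daily digest")


def is_clutter(subject, markers=CLUTTER_MARKERS):
    """True if the subject line contains any clutter marker."""
    lowered = subject.lower()
    return any(marker in lowered for marker in markers)


def drop_clutter(subjects, markers=CLUTTER_MARKERS):
    """Return only the subjects that do not look like clutter."""
    return [s for s in subjects if not is_clutter(s, markers)]


subjects = [
    "COVID-19 bus contact investigation",
    "Weekly Newsletter: transit news roundup",
    "Your subscription: bus industry headlines",
]

kept = drop_clutter(subjects)
print(kept)
```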
44:04 - But we still had a large number of records, and we do know that Office 365 can sometimes be a little haywire here, where it’s not really accurate.
44:16 - Office 365 generally is not going to be very accurate when you have multiple keywords and you have a combination of phrases and multiple custodians.
44:26 - So that is when we really need to go to the next level.
44:30 - So in this instance, after eliminating most of these noise emails,
44:35 - we took these records and we put them into NUIX.
44:37 - And NUIX did a much better job of eliminating some records which really didn’t make sense, because when we were doing phrase based searches and a combination of keywords, we did come up with records which were not relevant.
44:51 - So it cut down some of those records actually.
44:54 - So this is what a moderate intensity search is, and NUIX gave us a lot of insight as well; it gave us groupings and topics.
45:03 - But in general, we were confident that the records that were coming out of NUIX were in line with what the requester was looking for.
45:11 - But this definitely needed much more effort; it is not such a simple search.
45:15 - And we needed to make sure that these records were relevant, and we went back to the requester again based on the insight that the analyst had provided.
45:24 - And once we got the necessary approvals we moved forward with the dedupe and containment process, which again reduced the records by around 20 to 30%.
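A rough Python sketch of what a dedupe and containment pass could look like follows. It rests on an assumption that is not spelled out in this part of the talk: "containment" is taken to mean dropping a message whose full text already appears inside a longer message in the set, such as an earlier email quoted in a reply. The function names are hypothetical, and this is not how FOIAXpress implements the feature.

```python
import hashlib


def normalize(body):
    """Collapse whitespace and lowercase so trivial differences are ignored."""
    return " ".join(body.split()).lower()


def dedupe(bodies):
    """Keep the first copy of each message, comparing normalized text by hash."""
    seen, unique = set(), []
    for body in bodies:
        digest = hashlib.sha256(normalize(body).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(body)
    return unique


def remove_contained(bodies):
    """Drop messages whose text is fully contained in a longer message."""
    texts = [normalize(b) for b in bodies]
    kept = []
    for i, text in enumerate(texts):
        contained = any(
            i != j and text != other and text in other
            for j, other in enumerate(texts)
        )
        if not contained:
            kept.append(bodies[i])
    return kept


thread = [
    "Can we meet Tuesday?",
    "Can we meet  tuesday?",                        # duplicate after normalizing
    "Yes, Tuesday works.\n> Can we meet Tuesday?",  # reply quoting the original
]

unique = dedupe(thread)
final = remove_contained(unique)
print(len(thread), len(unique), len(final))
```

The pairwise containment check is quadratic in the number of messages, which is acceptable for a sketch but would need a smarter approach at tens of thousands of records.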
45:32 - So this is what really a moderate intensity search looks like.
45:37 - So when I say moderate intensity search, the characteristics are that it has lots of mailboxes, the record volumes run into hundreds, and we definitely need to run this process through multiple software.
45:47 - And we definitely do some analysis. And it’s more than likely that we have to go back and forth with the requester before we can finalize the records.
45:56 - Any questions? I do see three questions in the chat.
46:02 - So let me try to go with the first question.
46:06 - - [Alina] So the first question was, “Are you referring to eDiscovery when you’re using the search tool in Office 365? Is it an eDiscovery tool?” - “Are you referring to eDiscovery when you are using the search tool in Office 365?” Yes, it is the same thing; we do use eDiscovery, that’s right.
46:27 - - [Alina] Okay, great. And the next one, I’m not sure if you can answer, or if this is going to be one for Roger or Bruno.
46:33 - “Could you speak briefly on how subject matter experts and potential custodians are determined before creating a search query?” Thank you.
46:45 - - This is Roger, I can take that question. So it depends upon what the scope of the request is.
46:51 - CDC has set up an emergency operations center to handle the coronavirus pandemic, and they have teams that are set up to address specific aspects of the pandemic.
47:09 - So you have, for example, folks who deal with the vaccine, other folks who deal with the No Sail Order, various groups.
47:15 - So it depends upon what the request is about. If it’s COVID related,
47:20 - and it is a request like this one, where we have no custodians provided because the requester probably doesn’t even know who the custodians are,
47:29 - we send it to emergency operations and say, “Please give us the names of the folks who would be involved with this topic” that the requester is interested in.
47:39 - And so they would then identify either custodians or particular mailboxes being used by a team that would reasonably contain the records requested.
47:55 - Does it answer the question? - [Alina] I don’t see anything else in the chat so I think we’ll say yes, unless we hear something more.
48:03 - Thank you. I’m sorry, there was a follow-up.
48:12 - - [Roger] Sure - [Alina] He says, “Thank you.
48:14 - “And for non COVID topics, is it the same process?” - For non COVID topics it is the same process, but the process would be that we identify the program office within CDC that is likely to have responsive records and send it to them and say, we received this request, we need you to provide the documents.
48:34 - And if they want us to conduct the search, they will provide the names of the custodians whose mailboxes we search against.
48:45 - - [Alina] Thank you. - And sometimes they might come back and say... I’ll give an example of where a requester doesn’t identify mailboxes, but identifies a whole group.
49:00 - So a requester comes in and says, “I want you to search all employees’ emails within NCRD for this keyword.” Well, like Srinath said earlier, we count against email boxes for just a particular program office, right? So literally you’re asking us to search against the custodians for an entire program office or division.
49:28 - We’re gonna come back to you and say, “You’re gonna have to limit it.” So if you can’t limit it by name, you’re gonna have to limit it by topic, so that they can identify the folks who worked on this particular subject matter.
49:41 - And then they would provide a list of custodians.
49:49 - - [Alina] Thanks, Roger. No other question.
49:52 - - Sure. - Yeah, thank you. And if there are no questions I can move to the next slide.
50:02 - Okay. So I’m going to talk about high intensity search here.
50:08 - And if you look at the picture the person there It’s me who is by and large is here because of the type of request I got, I’m just kidding.
50:24 - (Alina laughing) So typically what happens in a high intensity search is that we do not…
50:33 - The biggest characteristic of a high intensity search is that the scope of the request is very vague.
50:40 - And we run into probably thousands of records, sometimes 10,000, sometimes 20,000, sometimes 30,000.
50:46 - And I’ve seen searches going to 80,000 records.
50:50 - Why do we get so many records? So if you look at the characteristics here, sometimes we don’t have a clear custodian mailbox
51:00 - defined. Sometimes we have too many custodian mailboxes defined, okay.
51:06 - And the next is sometimes we do not know the participants in this email conversation.
51:11 - So when we do not know the participants, it’s possible that there could be a discussion with so many people on this particular topic.
51:18 - And it could go to any extent that it becomes very difficult to identify which emails are really relevant to the scope of this request actually.
51:32 - And sometimes we also do not get any keywords.
51:35 - So we have to frame our own keywords; we have to come up with keywords based on the request.
51:43 - And most of the time when the high intensity search starts, it’s an unknown unknown for us.
51:48 - But looking at the request, we can say that this is probably a high intensity search because once we run the search through Office 365, we see all the different volumes of records.
51:59 - Then we figured out that, yes, this is going to be a wild goose chase where we are not going to get too many records.
52:05 - And what really makes it more complex is that sometimes we have requests where we have to use Boolean operators like OR and AND to concatenate the search terms, and the records can be very numerous.
52:20 - And finally, even having too many attachments in the emails can complicate things sometimes: busy slides and presentations, which have different words that have absolutely no relation to the scope of the request.
52:32 - So that is what I’m really talking about with a high intensity search here.
52:38 - Any questions related to the high intensity search? - [Michelle] There are no questions on the phone.
52:47 - (indistinct chatter) - I just wanted to add…
52:53 - At least from my experience with these high intensity searches, I think Srinath was pretty generous when he said that “A record on average is four pages.” A record could be one page, or it could be as much as five or six pages.
53:15 - So we talked about an email string, and that is just the email string itself, without including the attachments.
53:20 - So if it has three or four attachments and each attachment is four pages,
53:26 - you see how that turns into a lot of records, just on its face.
53:32 - And then when we say we located 5,000 records, that doesn’t translate directly to pages.
53:42 - It could be 5,000 pages, or 5,000 records could be 25,000 pages.
53:48 - It all depends upon what the size of a record is and how many attachments it contains.
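The record-to-page arithmetic above can be written out as a small Python helper. This is a back-of-the-envelope sketch: `estimate_pages` and its parameters are hypothetical names, and the per-record figures are the round numbers used in the discussion, not measured averages.

```python
def estimate_pages(records, pages_per_email, attachments_per_email=0,
                   pages_per_attachment=0):
    """Estimate total pages: each record is an email body plus its attachments."""
    pages_per_record = pages_per_email + attachments_per_email * pages_per_attachment
    return records * pages_per_record


# 5,000 one-page emails with no attachments.
print(estimate_pages(5000, 1))
# 5,000 records at five pages each.
print(estimate_pages(5000, 5))
# 5,000 four-page email strings, each with three four-page attachments.
print(estimate_pages(5000, 4, attachments_per_email=3, pages_per_attachment=4))
```

The third case shows why dropping attachments during negotiation matters: under these assumed figures the attachments account for three quarters of the pages.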
53:57 - And what we have had is some requesters with whom we’ve agreed to remove the attachments, right.
54:04 - So they just wanted all emails, and we would negotiate and say, you can always come back and ask for specific attachments after the fact.
54:14 - And that could also help us process your request in a much more timely manner, if we don’t have to process the entire records, including attachments and everything else.
54:27 - - Thank you, Mr. Roger, for reminding me of that issue.
54:30 - I forgot about it, thank you. If there are no questions we can go to the next slide.
54:42 - I’ll pause for 15 seconds so everyone can read this request; it is an example of a high intensity search.
55:16 - All right, so the requester was interested in all responsive records related to procedures, guidelines and discussions that happened around coming up with the guidance on wearing face masks to slow the spread of COVID-19.
55:34 - In this instance there are no specific keywords given to us.
55:38 - And we had to come up with a set of keywords.
55:41 - The next is that we had to identify which custodian mailboxes to search.
55:48 - As Roger has already mentioned, we get that information from the EOC, who give us guidance on what those custodian mailboxes are.
55:55 - And the next is the date range, which either comes from the requester or is given to us by the EOC. So in this instance, the keywords were identified as face mask(s), face covering(s), respirator(s) and N95.
56:10 - So these were the four different keywords that we had.
56:14 - So when we do a search here, what it means is that I have to locate records which contain either face mask or face masks, face covering or face coverings, respirator or respirators, or even respiratory.
56:31 - So we do prefix searches as well as suffix searches, and finally the N95 mask.
56:36 - So we had to concatenate a string to come up with the keywords for the search.
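The keyword variants listed above (singular and plural forms, plus a prefix match on "respirator") can be sketched as one concatenated pattern. A Python regular expression is used here purely as a stand-in: the wildcard syntax of the Office 365 search tool is not described in the talk, and `PATTERN` and `find_keywords` are hypothetical names.

```python
import re

# One concatenated pattern covering the four keyword families:
#   face mask(s), face covering(s), respirator* (prefix match), N95
PATTERN = re.compile(
    r"\b(?:face\s+masks?|face\s+coverings?|respirator\w*|n95)\b",
    re.IGNORECASE,
)


def find_keywords(text):
    """Return the distinct keyword hits in the text, lowercased and sorted."""
    return sorted({m.group(0).lower() for m in PATTERN.finditer(text)})


print(find_keywords("Face masks and N95 respirators"))
print(find_keywords("Guidance on respiratory illness"))
print(find_keywords("Order more masking tape"))
```

The prefix match on "respirator" is what pulls in "respiratory" as well, which is one reason the record count for this request grows so quickly.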
56:46 - Any questions so far? - [Michelle] There are no questions on the line.
56:57 - - I can move to the next slide which shows search results actually.
57:04 - Based on the search that was performed here we had come up with 22,877 records.
57:12 - So I’m only talking about unique emails, actually.
57:16 - It’s not the number of pages. The preliminary insights that we got from the preliminary search in Office 365 were that mask accounted for around 12,000 records, respirator around 12,000, and face covering around 6,000, so the total was going to be around 23,000 records here.
57:41 - And just keep in mind that these were only emails sent by these four users, who were high-level officials within CDC; we only had four high-level officials, and we are not even talking about emails which were sent to them.
57:56 - So the volume of records here was very high, past 22,000 records.
58:02 - And looking at these results, I do know that with these four mailboxes, dedupe and containment could probably eliminate 40% of the records here. And the search analytics I’m providing here are only at the Office 365 level, which is probably around 80% accurate at this point in time. So if I run these records through containment, the count will probably come down to around 10,000, not less than that.
58:35 - I cannot guarantee that the count is going to be lower than that, but still that’s a huge number of records.
58:42 - So 10,000 records, at even five pages for each record, could translate to 50,000 pages.
58:49 - So I don’t think it is humanly possible for any of our analysts, or even the requester, to go through 50,000 pages of data, digest and comprehend that information, and come up with some reasonable analysis.
59:06 - So at this point in time, I simply make a determination, letting the analyst know that the scope is too broad.
59:17 - We definitely need to narrow down the request, and these are the insights that I see and these are the keywords that I see, so if the requester wants to make a determination based on how the insights look, I go ahead and share all the information.
59:32 - So the analyst goes back to the requester and tries to narrow down the scope in this instance.
59:39 - However, in some instances, let’s say we come up with a few thousand records, like 7,000 or 8,000, and there are lots of mailboxes, like 15 or 16. Then I do know that there’s a potential for a lot of duplicates, so in that instance we run the dedupe process, and if the result is less than 10,000 records then we probably go ahead and go through the remaining steps to prepare the data.
60:14 - Any questions on this so far? - So I have a question here; someone asked, “If applicable, how do you use Nuix in high intensity searches?” - Sure. So the way we use Nuix in a high-intensity search is that we first get the first set of data from the Office 365 search, which is an eDiscovery search.
60:43 - We take the whole data set as a CSV file, we don’t take individual emails (indistinct), and we put that data into Nuix and run the same search terms that we ran in Office 365.
61:01 - Nuix does a better job with that, and it has a mechanism to eliminate some records which were probably missed in Office 365.
61:10 - And it brings the number of records down; that’s the first step.
61:14 - And sometimes it can also eliminate some duplicates, so we definitely see a reduction. It just depends on the number of custodian mailboxes, the number of keywords and so forth, so it would be a mistake for me to say what percentage Nuix can eliminate of the records which were not eliminated by Office 365.
61:31 - The next step is that Nuix has much higher analytical capabilities. It analyzes which mailbox has a lot of emails being sent, which domains or organizations are sending all these emails, and which email addresses are in the CC, the BCC and the To. It also provides us insight on which dates or timeframes see a lot of emails going out; I’ll call them heat map kinds of things. So that’s the analysis. And as I said, it gives subsets of data: for example, if you use masks or face masks, for these three keywords you see around 500 records. It has its own way of grouping things into different subsets. So it provides sufficient analytical information for the requester and the analyst to make an informed decision to narrow down the scope; that is all I can tell you.
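The kind of analytics described above, such as which domains send the most email and which days carry the most traffic, can be approximated with simple counting. The sketch below uses only the Python standard library and hypothetical sample data; it illustrates the idea and does not represent Nuix functionality or its API.

```python
from collections import Counter
from datetime import date


def sender_domain_counts(messages):
    """Count messages per sender domain (the text after the last '@')."""
    return Counter(sender.rsplit("@", 1)[-1].lower() for sender, _sent in messages)


def daily_counts(messages):
    """Count messages per sent date; the raw numbers behind a heat map."""
    return Counter(sent for _sender, sent in messages)


# Hypothetical (sender, sent date) pairs.
messages = [
    ("a@cdc.example", date(2020, 4, 3)),
    ("b@cdc.example", date(2020, 4, 3)),
    ("c@hhs.example", date(2020, 4, 4)),
]

by_domain = sender_domain_counts(messages)
by_day = daily_counts(messages)
busiest_day, busiest_count = by_day.most_common(1)[0]
print(by_domain.most_common())
print(busiest_day, busiest_count)
```

Counts like these give the analyst and the requester something concrete to negotiate with, for example narrowing the request to the busiest week or to a single sending organization.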
62:41 - So that is how we primarily use Nuix in a high-intensity search.
62:45 - - [Alina] Thank you. I have another question that might be better for Roger or Bruno.
62:49 - Someone on the YouTube chat asked: do they need to request all related attachments if they want attachments? So I think that means, would you assume attachments unless you hear otherwise, or do people actually specifically ask for attachments when they want them? - Great question. Unless you say you don’t want attachments, the search would include attachments.
63:15 - - [Alina] So the default is yes. - Yes default is yes unless you say no.
63:21 - - [Alina] Great. And then the second question that came in had to do with records retention.
63:26 - “How far back do archives go for a FOIA search? Do they follow records retention schedules and get destroyed on a schedule like paper files ordinarily would be, and can the searches recover files that have been deleted by individuals as no longer needed?” - Long question. I’m not a records retention expert, but this is what I’ll say: we do receive requests for documents, primarily emails.
64:02 - We do have cases where someone says, “I’m looking for all email correspondence from X, starting from 2005 or from 2000,” and we can search against that custodian’s mailbox; we start from that timeframe, from 2000 or 2002.
64:17 - If those records are still contained within the mailbox, they are going to be pulled. If they are not there, we won’t be able to pull them.
64:24 - So can we pull data that has been deleted from a person’s mailbox? I don’t think so, but I’m not sure; correct me.
64:32 - I don’t believe that Office 365 eDiscovery can do that.
64:37 - If the emails are for somebody who is in the Capstone program, which is a few folks whose emails are basically archived forever, and I don’t mean literally, but pretty much forever, then we can search against any date range.
64:54 - So for example, Redfield’s emails are archived even though he’s gone, so 10 years from now, if someone makes a request for COVID documents,
65:08 - we’re going to find them, because everything in his mailbox was captured.
65:14 - - [Alina] Okay, and I’ll just speak for NARA and records management: electronic records are scheduled like paper files generally, so yeah, that is true.
65:26 - Thank you. - Thank you Alina, and thank you Roger.
65:31 - I just wanted to add one statement to this: there’s a records retention policy within CDC and each mailbox is treated differently, so I would ask the records retention officers within the divisions within CDC; they can give clear direction on how many months or how many years a particular mailbox is retained.
65:56 - So based on that, we can tell the requester that this mailbox or these emails are not going to be found, or Roger can recap the policy (indistinct) - This is Roger, and I just want to add something on the Nuix tool, because we did have a question about that. What Nuix does that the EDR doesn’t do is that it is able to better analyze the data, and by being able to properly analyze the data, it helps us actually find a needle in a haystack; that is what Nuix is supposed to do. So we don’t use it for, let’s say, deduplication, because the EDR feature could do that; it’s more to analyze the data.
66:47 - So for example, what Srinath was talking about, heat maps, shows where most of the email traffic is coming from; it categorizes records. We’ve had Nuix for quite a while, but we have definitely increased our usage of different tools since COVID.
67:08 - And so it’s still a work in progress and we continue to explore the functionality of the system, but it certainly is a much, much more robust system for analyzing data than the EDR is; it’s helping us look at records.
67:31 - - Thank you Roger. - On Twitter, right? - Correct, we’ll be monitoring Twitter, and so we do have one question.
67:39 - “What does CDC have available, or will make available, to help requesters better understand who the custodians will be for particular emails? Are there org charts or directories?” They said, “It seems like CDC is placing the burden on the requester to know this.” - To the extent that we placed the…
68:05 - My position is that in some situations, and there are quite a few, the requester may not know who the custodians are, and sometimes they do. To the extent that the requester would not know who the custodians are,
68:19 - I tell my team, we should not go back to them and ask them for names of custodians because they wouldn’t know.
68:25 - For example, if somebody’s request says, “I want any correspondence sent by the chief of staff for Governor Cuomo,” and gives us his name, “to anybody in CDC.”
68:36 - Well, they don’t have to know who the recipient in CDC is; they’ve given us the name of the person who sent the email, right? And so then we can go to the EOC and say, “Hey, did anybody have any contact with the Chief of Staff of Cuomo?” So, yes, in some circumstances... and I’ll say about the EOC, the EOC is made up of employees who are detailed there for a time and then they leave, so it’s a revolving door.
69:04 - So there’s not the… The people who are in the EOC today may not be there 60 days from now.
69:12 - So it continues to change. So there’s not a list of folks who are there for the entire duration of the pandemic; they go on detail for 30 to 60 days and then they go back to their program office.
69:24 - Very few of them stay on for much longer periods.
69:27 - So that is part of the give and take. And to the extent, and I’m sure this has happened, and I would own that and apologize for it, that we have gone back to you and asked for custodians in a situation where you would not know who the custodians are: if you describe the topic well, then it makes it easier for us to identify the custodians.
69:56 - I mean, if you say, “I want all correspondence about communications between CDC and CVP with regard to some particular topic,” right? If the topic is scoped enough, we will be able to identify the folks within CDC who had any discussion, maybe not everyone, but at least those who were involved in the discussion.
70:24 - With regard to whether there’s going to be an org chart: again, I’m not sure an org chart necessarily will be helpful unless you’re talking about the heads of units who don’t change, but even they change. I mean, I think for the managers over the EOC, we’ve gone through at least three to date, so they change.
70:48 - So I think what is important is be very clear about what it is you are asking for.
70:54 - You don’t have to give us clarity on the custodians, but clear on what it is that you’re looking for.
71:00 - And then we can take it from there, and to the extent that it’s still not clear who the custodians are, we’ll come back to you and ask you to refine your ask so that we can identify who was having a discussion about what you’re asking for.
71:21 - - Thank you, Roger, and that was a good reminder to everyone of the title, Finding a Needle in a Haystack.
71:32 - It felt good actually. And I do see one question here: “How do we eliminate duplicates?”
71:40 - I think that is already covered in the discussion: we can use any of the tools, like Office 365 or Nuix or FOIAXpress, to eliminate duplicates; however, for containment right now,
71:54 - our capabilities are limited to using FOIAXpress containment, thank you.
72:02 - Any questions on this, so far? - [Event Producer - Michelle] We have no questions on the line, and that covers the chat for now.
72:11 - Thank you yeah. - So before I move to the next slide, what I would say is that any high intensity search comes with lots of complexities, and a decision has to be made whether we move forward with it, or we hold it back and put it back to the requester.
72:26 - So that decision is made based on the number of custodians and the volumes that we see, whether the keywords are very generic and so forth, so sometimes there is some discretion on when we have to go back to the requester to let them know, so that they can act on this search.
72:42 - And I can move to the next slide. So I have covered the different categorizations of the searches based on the technical complexities that we have seen so far.
72:56 - The next section is about the issues that we see when we perform searches here.
73:04 - So we have made an attempt to identify the problems and to help the end user know the problems that we see, to see if we can find some solutions and come up with better search results.
73:17 - So the first issue that was really identified was broad scope, the second was high record count, and the third is average data quality.
73:25 - And the three of these are pretty much related and I can quickly go to the next slide.
73:30 - Then I’ll be talking about the broad scope of search.
73:35 - - I think by now most of you would be pretty much aware of what broad scope really means. The characteristics of a broad scope are having too many keywords, having very generic keywords, like searching on autism or searching on a file, or having too many mailboxes.
73:55 - And the date range is very long; sometimes we get date ranges of a few years or a few months, and there are too many results, where it becomes really hard for us to identify the right records.
74:10 - So just to put it in perspective, look at the picture there: it’s a sunny day and we have so many umbrellas there, but in reality we only need one umbrella to satisfy the request here, and in this scene the yellow umbrella is good enough for us to identify the records that are relevant to the scope.
74:31 - Any questions related to this topic of broad scope? - [Event Producer - Michelle] There are currently no questions on the line. - [Alina] Nothing new, thank you.
74:43 - - Next slide, yeah. And as we have already discussed in the high intensity search, you see very high data volumes, and it really becomes very difficult for us to identify which is the right data and which is the wrong data, unless the requester is really specific about what he’s looking for. Some requesters are very good at telling us what they’re really looking for, but sometimes... I cannot get into the requester’s mind.
75:16 - I would have to read his mind to understand what he is really looking for, or what she’s really looking for; that is what makes it complex.
75:24 - So it happens that we get so much data, and we cannot know which is the real data. Just look at the picture: we have so many records there and we do not know which is the right data.
75:43 - Any question? - [Event Producer - Michelle] There are no questions on the line.
75:52 - - Yeah, I can move to the next slide? And I think this is an interesting topic here.
76:01 - So I’m using this term called average data quality. When we do a search based on the keywords, sometimes we do see records being returned, and when we analyze the records, it turns out that these records are not really what the end user is looking for. But since the requester has not specified otherwise, we still have to deliver those records. I can give you an example of a request where we had to search for records on all mailboxes in the CDC Guatemala office. When we did the search on Guatemala and ICE, where ICE stands for Immigration and Customs Enforcement, what we found was that we were getting only emails where the word Guatemala was showing up in the email signature and the word ICE was showing up in some attached documents, you know, a PDF or a Word document.
77:01 - And we absolutely knew that there were no records that matched what the end user was looking for.
77:06 - So this is what it means: we have the quantity of data, but the quality is very poor, because we are very certain that we are not getting the right records.
77:13 - Sometimes somebody asks for “response for COVID.” When we run a search for “response for COVID,” we do have a division that is working on the COVID response.
77:31 - So people have signatures that say COVID-19 response, and what happens is that all the emails with signatures of COVID response show up. I do know that these are not the records that they’re looking for, but I still have to deliver them because these records match what the requester requested.
77:51 - So, if it makes sense, what I’m trying to say is that the quality of the search results is poor because of the keywords that have been provided or because the scope of the request was not really clear. Any questions? - [Event Producer - Michelle] There are currently no questions on the phone.
78:17 - - [Alina] There is a question on chat that is broader. We can save it, or we can take it now; I think it is for you, Roger.
78:24 - - [Roger] Okay, we could take it now. - [Alina] “Okay, I think the CDC FOIA annual report says you received approximately 2,400 requests last year.
78:33 - How many FTEs do you have dedicated to doing FOIA searches for this number of requests?” - Dedicated to FOIA searches is one.
78:48 - That’s it. We are working on getting a contractor to assist us, but right now it’s just Srinath doing the searches.
78:56 - - [Alina] Okay, thank you. - Sure. I just wanted to add, as far as this average data quality goes, in a situation where the keyword that has been provided by a requester is so generic that it’s going to be found in, for example, an email signature, one way to limit that would be to say the keyword should appear in the email content or in the subject.
79:23 - That would narrow it down, so that the word should appear in the body of the email or in the subject. Or I think we can do searches for words within a certain distance of one another.
79:40 - So COVID within five or 10 words of, let’s say, some other word, just so that we make sure that we find whatever it is that you’re looking for, right? Because at the end of the day, you, the requester, are seeking information that is useful to you, and to the extent that we are looking at and reviewing documents that are of no use to you,
80:03 - that is a waste of our time and a waste of your time.
80:08 - That is always going to delay our response to you. At the end of the day, you want information that’s useful to you, and a lot of times when it comes to e-discovery searches, you as a requester can do a lot to help us make sure that we have good data to provide to you, by the way you scope your requests and to the extent that you make it easier for us to be much more precise in identifying the documents that are responsive to your request.
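To make the idea of limiting a keyword to the subject or body concrete, here is a minimal Python sketch. It is not the CDC tool's behavior: the `Email` class, its fields, and the sample messages are hypothetical, and it assumes the signature has already been separated from the body.

```python
import re
from dataclasses import dataclass


@dataclass
class Email:
    # Hypothetical, simplified record: the signature is assumed to be
    # stored separately from the body.
    subject: str
    body: str
    signature: str = ""


def matches_in_content(email: Email, keyword: str) -> bool:
    """Return True only if the keyword is in the subject or body."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return bool(pattern.search(email.subject) or pattern.search(email.body))


emails = [
    # Keyword appears only in the signature, so this is noise.
    Email("Site visit schedule", "Agenda attached.", "Jane Doe, CDC Guatemala Office"),
    # Keyword appears in the subject, so this is a real hit.
    Email("Guatemala travel guidance", "Draft for review.", "John Roe, Atlanta"),
]

hits = [e.subject for e in emails if matches_in_content(e, "Guatemala")]
print(hits)
```

Only the second message is returned, because the first mentions the keyword solely in its signature block.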
80:45 - - Thank you, Roger. If we don’t have any additional questions, we can move to the next section.
80:58 - So far no one is complaining regarding this here, and we have done some analysis and we have got some solutions.
81:10 - We are glad that we have some recommendations that we are willing to share with the end users, and we can also take any inputs or advice that you have for us so that we can come up with better searches.
81:23 - So hopefully the last section is going to be more intuitive and useful to all of you.
81:31 - So let’s get started. The first aspect of an improved e-discovery search is what I call a well-defined scope.
81:39 - So what does a well-defined scope really mean? I categorize this into three different sections. When I say well-defined scope, what I mean is that we do not want any ambiguity in the scope. The requester needs to be very precise and concise in what he’s looking for, and as long as the request is very precise
82:00 - and concise in what it is looking for, I’m very confident that we can get very good results.
82:05 - The second is, if the requester is looking to perform multiple searches within one single request, the recommendation is that he split each search into its own line item within the search request. It would be better if each sub-search is really its own individual request; that way each search is very focused on one objective, and that really helps us out.
82:34 - And the last recommendation concerns newsletters and subscriptions. In most of the searches that I have observed, the requester has no use for the newsletters and subscriptions that come in, and we do see a lot of requests explicitly stating, “We do not need newsletters and subscriptions; we are looking only at conversations,” and things like that.
82:53 - So that is really appreciated. When we have these three or four items taken care of and the scope is really well defined, it makes the search much more predictable.
83:04 - It saves a lot of time, as I have stated.
83:10 - It provides much more productive results and it helps the requester get the right data.
83:14 - Any questions on this? - [Event Producer - Michelle] There are currently no questions on the phone.
83:24 - - Thank you. - [Alina] And no new chat questions, thank you.
83:28 - - Thank you, let me go to the next item, which is limiting keywords.
83:34 - So when I say limiting keywords, what I mean is this: sometimes the requester gives us a list of keywords and says, “We want you to use these keywords.”
83:44 - You give us one subset of keywords and another subset of keywords, this subset OR that subset, AND this, AND that.
83:51 - So what happens is that when we have multiple keywords coming in, I absolutely know that the search results are very diluted and we are getting a much more generic set of data.
84:06 - So it’s going to be a needle in a haystack here.
84:09 - That’s for sure. So if the requester can be very concise, saying, “I’m only looking for this keyword or this keyword,” that really helps us in narrowing down the search.
84:21 - And the second and biggest recommendation I would make is that rather than using an AND/OR search, you go with a phrase search.
84:30 - I can give an example of a phrase search. We did get a request asking for testing for COVID-19 in long-term care facilities.
84:43 - So that is a very good phrase, but it doesn’t necessarily mean that when I search for this phrase I’m going to get any records, or all the records, because people use different words, and they can phrase the content that you’re looking for in different ways.
85:01 - So what we figured out is that it could also be testing for COVID-19 in a skilled nursing facility.
85:09 - So the way the search was performed was that rather than say “testing” we say “test*”, test followed by a star, so the rest of the word is a suffix.
85:17 - So it detects test or testing; that is one thing.
85:20 - And the next is looking for testing within five or ten words of the reference to COVID, which could be COVID, COVID-19, or Corona, things like that. So testing is within a few words of COVID, COVID-19, or Corona, and also, additionally, the (indistinct) a long-term care facility or long-term care facilities, skilled nursing home, and things like that.
85:55 - So we just need to get creative with words and try to come up with a clear search to capture those words. My observation has been that, rather than doing an AND search of COVID-19 and skilled nursing facility and testing, when we did this phrase search, trying to find words within a number of words of testing, we were able to get much better results, which are much more accurate.
86:27 - So that is one thing that is specifically recommended over an AND/OR search.
86:33 - Because an AND/OR search can give us an email with a thousand pages.
86:38 - The first word could appear at the start of the body of the email and the last word could be somewhere in an attached Excel document, or it could be in a Word document.
86:47 - So that record may not be relevant actually.
86:50 - So we can eliminate such things when we do a phrase search.
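The wildcard-plus-proximity idea described here (a stem such as test* within a few words of COVID or Corona) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 10-word default, and the sample sentences are hypothetical, and it does not reproduce the query syntax of Nuix or any other e-discovery product.

```python
import re


def tokens(text):
    """Lowercase the text and split it into words, keeping forms like covid-19."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def within(text, left_prefix, right_prefixes, max_gap=10):
    """True if a word starting with left_prefix occurs within max_gap words
    of a word starting with any of right_prefixes (a test*-style match)."""
    words = tokens(text)
    left = [i for i, w in enumerate(words) if w.startswith(left_prefix)]
    right = [i for i, w in enumerate(words)
             if any(w.startswith(p) for p in right_prefixes)]
    return any(abs(i - j) <= max_gap for i in left for j in right)


near = "We expanded testing for COVID-19 in skilled nursing facilities."
# Both terms are present, but far apart: the kind of hit a plain AND search returns.
far = "Test results attached. " + "filler " * 50 + "Signature: COVID-19 Response"

near_hit = within(near, "test", ("covid", "corona"))
far_hit = within(far, "test", ("covid", "corona"))
print(near_hit, far_hit)
```

A plain AND search would match both samples; the proximity check keeps only the one where the terms actually appear together.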
86:56 - And the third thing is that, let’s say the end user is coming up with keywords, it will always be better if they can prioritize which keyword takes precedence. So if they’re giving three keywords, I recommend giving them a priority: this is the first keyword that takes precedence, the second has less precedence, and the third less still.
87:13 - Because when we run the search and get too many records and load them into Nuix, it gives us subsets of the records, and those specifics can help us present the information to the analyst, saying, “Okay, this subset of records matches the keyword taking precedence, so if you want this precedence, review this subset of records.”
87:33 - I’m trying to help the users summarize what they’re really looking for, rather than having keywords like this and that and so forth.
87:40 - So that is what really helps us: targeting the keywords, doing a phrase search, and keeping the keywords as minimal as possible.
87:48 - Any questions on limiting the keywords? - [Event Producer - Michelle] There are currently no questions in the phone queue.
87:58 - - This is Roger. I wanted to say something here because I want to make clear to everyone who’s listening that there is no requirement that when you submit your request you have to provide us with keywords.
88:09 - So the point of this example is that if you do provide us with keywords, provide a limited number of keywords, ’cause sometimes folks give us a whole page of keywords, or two pages of keywords.
88:23 - So one of the most important things that you can do is to have a well-defined scope, right? If you have a well-defined scope, we will be able to find the records, because, like Srinath was saying, the keyword that you might use might not be the term that the folks who were having the conversations internally would use.
88:45 - So you might say long-term care and maybe they might use the name of the facility, or they might just say LTC or whatever it is.
88:54 - So if the scope is well defined, that makes for a very good search.
88:59 - If you want to provide keywords, you can, a limited number of keywords, but you’re not required to give us keywords.
89:07 - You’re also not required to give us custodians, but if you do want to give us a list of custodians, give us a limited list of custodians, because the more custodians you provide to us, the more records we are going to pull and the more duplicate records we are going to get. If there are 10 or 15 custodians, and all of them are CC’d or participants in a particular discussion, that means that one email string is going to be contained within 15 or 20 custodian email boxes.
89:38 - And so, just as a point of clarification, you don’t need to give us keywords, you don’t need to give us a list of custodians, but if you do, just limit it.
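The duplication effect Roger describes, where one email string shows up in every participant's mailbox, can be illustrated with a small Python sketch. It assumes each copy carries the same message identifier across mailboxes; the custodian names, identifiers, and the function itself are hypothetical and are not taken from the CDC's tooling.

```python
from collections import defaultdict


def dedupe_across_custodians(mailboxes):
    """mailboxes maps custodian -> list of (message_id, subject) pairs.

    Returns message_id -> (subject, sorted custodians holding a copy).
    Assumes copies of one message share a message_id.
    """
    holders = defaultdict(set)
    subjects = {}
    for custodian, messages in mailboxes.items():
        for message_id, subject in messages:
            holders[message_id].add(custodian)
            subjects.setdefault(message_id, subject)
    return {mid: (subjects[mid], sorted(holders[mid])) for mid in subjects}


mailboxes = {
    "custodian_a": [("<m1@example.gov>", "Guidance draft"),
                    ("<m2@example.gov>", "Scheduling")],
    "custodian_b": [("<m1@example.gov>", "Guidance draft")],
    "custodian_c": [("<m1@example.gov>", "Guidance draft")],
}

unique = dedupe_across_custodians(mailboxes)
raw_count = sum(len(messages) for messages in mailboxes.values())
print(raw_count, len(unique))
```

Three custodians yield four raw hits but only two distinct messages, which is why every added custodian increases the review burden even when no new information is found.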
89:52 - - Thank you, Roger. That was very useful information and a very good reminder, and I can move to the next item, which is avoiding generic keywords.
90:03 - So when I say generic keywords, right, I do see a lot of…
90:07 - I can give an example. I see a lot of requests coming in with autism, and I had one request where we were asked to search the mailbox of a custodian who is a researcher on autism.
90:21 - So when we did a search on his mailbox, a lot of his emails were about autism, so we came up with 30,000 records of autism-based emails within the span of three months. It’s like trying to search a stockbroker’s email for the word stock.
90:38 - So that was a type of request which is very generic. In this instance, the recommendation would be, if you are giving some generic keywords, please also provide some supplemental keyword that will help us narrow down the search. So if somebody is sending us autism and searching the mailbox of an autism researcher,
90:59 - probably there is a medicine or there’s a condition related to it, something which can narrow the results or something more specific, so that we get the subset of autism records that they’re looking for. That really helps us out in the long run.
91:14 - And the next example is about meat processing plants and guidelines and things like that.
91:22 - We had lots of keywords that were generic, so I’m just giving an example: if you give us generic keywords, make sure that you throw in at least one supplemental keyword to narrow down the results.
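As a small illustration of pairing a generic keyword with a supplemental one, the Python sketch below keeps only records that contain the generic term and at least one narrowing term. The record texts and the supplemental keyword are invented for the example; the function is a hypothetical stand-in, not the search the CDC runs.

```python
def narrow(records, generic, supplemental):
    """Keep records containing the generic keyword AND at least one
    supplemental keyword (case-insensitive substring match)."""
    generic = generic.lower()
    extras = [s.lower() for s in supplemental]
    kept = []
    for text in records:
        lowered = text.lower()
        if generic in lowered and any(s in lowered for s in extras):
            kept.append(text)
    return kept


records = [
    "Autism study enrollment update",
    "Autism prevalence report draft",
    "Autism newsletter, spring issue",
]

generic_only = [r for r in records if "autism" in r.lower()]
narrowed = narrow(records, "autism", ["prevalence"])
print(len(generic_only), narrowed)
```

In a mailbox where nearly every message contains the generic term, the supplemental keyword is what does the narrowing.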
91:39 - I can go to the next slide, which is to limit the number of custodians. As Roger has already gone over it, the more custodians that you have, the more emails and the more duplicates we need to go through.
91:54 - So I’m hoping that I don’t need to say that again and again: the fewer the custodians, the fewer the records, and the easier it becomes to narrow down the search request. And the last item is reducing the time span of the searches.
92:13 - So sometimes I do see requests coming in with a four-year time span or a five-year time span. Sometimes we find records and sometimes we don’t find records, because of the records retention policy, which is very different for each mailbox.
92:28 - However, we do notice that sometimes when we run searches for a year or a couple of months we get 20,000 or 30,000 records, so it’s always better to limit the time span and be very specific about the time span you’re looking for.
92:43 - If there is an event that happened, probably a month, or even 15 days before the event and 15 days after the event.
92:50 - Probably there’s a lot of noise that he’s not just predictive activity, or let’s say an example is a quarter three.
92:58 - (indistinct) for like a month or two. So if there is that one-month to two-month time span, it really helps us identify the right records.
93:09 - So these are some of the improvements which will really help us get the right data for the user’s request.
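The recommendation to anchor the time span on an event can be sketched as a simple date window. This Python example assumes a window of 15 days on either side of the event; the event date, the window size, and the function names are hypothetical choices for illustration.

```python
from datetime import date, timedelta


def window_around(event_day, days_before=15, days_after=15):
    """Return the (start, end) dates of a search window around an event."""
    return (event_day - timedelta(days=days_before),
            event_day + timedelta(days=days_after))


def in_window(sent, window):
    """True if an email's sent date falls inside the window, inclusive."""
    start, end = window
    return start <= sent <= end


event = date(2020, 6, 15)  # hypothetical event date
window = window_around(event)

sent_dates = [date(2020, 1, 10), date(2020, 6, 1), date(2020, 6, 30), date(2020, 9, 2)]
kept = [d for d in sent_dates if in_window(d, window)]
print(window, kept)
```

A request scoped this way replaces a multi-year range with roughly one month of mail per custodian.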
93:18 - And if anybody has any questions for me, I’m willing to answer them related to this topic, on this section.
93:29 - - So Srinath, this is Alina. We have a question actually from our side, which you may be able to answer, and this is possibly a question also for Roger and Bruno: “Would you be able to talk a little bit about the role of the FOIA public liaison, and whether, when a very broad search is submitted, the requester is able to reach out to the public liaison, who would be willing to help the requester draft up a well-scoped request?” - Yeah sure, Alina, and I’ll give the question to Roger. Roger, do you want to answer the question? - Sure. So at the CDC, yes, I’ve had requesters contact the FOIA public liaison, which I think now is Bruno, or they reach out to me. I’m more than happy to work with requesters on the scope of the request.
94:32 - But at least from where I sit, it is much more advantageous for them to work with the analyst. When we get a request, it is assigned to an analyst, and that analyst handles the case from cradle to grave.
94:43 - So at some point in the process, I’m going to see that request, review the request, and then it gets released.
94:51 - Oftentimes the person who knows the day-to-day, the ins and outs, who has more details about the request, would be the analyst.
95:00 - So my preference would be to first start out working with the analyst. If there’s an impasse and then you have to escalate it, I’ll be more than happy to jump in.
95:12 - But I think if you start with the analyst, in most situations they are able to work with requesters to reformulate their request in a way that is satisfactory to both sides.
95:28 - Sometimes we have an impasse, and sometimes they may have an impasse with me.
95:34 - It just depends upon what you’re asking for.
95:37 - So for example, if the only thing that you want me to do
95:39 - is search against all the email boxes for a particular program or division,
95:44 - we’re going to have an impasse, because I’m going to say we cannot search against 300 or 400 custodians, because I cannot push a button to do that.
95:54 - He’s going to have to manually put in every single email box for every single employee in a division. That right there would be an unreasonable request, and it’s going to take an unreasonable amount of time. So certainly, yes, you can contact the public liaison.
96:14 - You can contact me directly. You can contact Bruno to help you
96:17 - with your request, but the best person to start with would be the person who was assigned your request, and that person’s name is always in the acknowledgement letter that you receive.
96:27 - So you have the contact information of that person in your acknowledgement letter, and it’s best to start with that person.
96:37 - - Okay, great thanks. I think we have another question on the chat, yes.
96:42 - “So it was explained earlier that containment tools pull the last email string; however, what happens if multiple strings are created with recipients added or dropped and conversations going in multiple directions? Will the program keep those break-off strings, or will they be eliminated by the program?” - That’s a very good question, and I can answer it for you.
97:07 - Yes, so if there is a breakage, or somebody changes the content of the email or adds a new recipient or deletes something, that chain is broken, and what happens is that another record is created.
97:22 - When the analysts look at the records, they make sure that, if it’s the same thing, if it’s part of the containment, it goes into one record. But if the chain is broken, it does create a new record.
97:36 - So the containment will not work for that particular instance here.
97:42 - - Yeah, just to amplify what Srinath said: if all the email correspondence was not contained within one email string, then any separate emails are completely separate records.
97:58 - They’re not going to be eliminated. - [Alina] Okay, and then we had a follow-up, I think, on the same basic topic: “Could you please discuss the topic of the most comprehensive email threads?” - Let me attempt to answer that question.
98:18 - I’m going to assume when you say the most comprehensive email thread, you mean the email that contains every single email correspondence about a particular topic.
98:29 - So if that exists, because sometimes it may not exist, right, so if one email thread contains every single discussion about the particular subject matter, well, I would assume that is the most comprehensive, and to the extent that the containment system identifies that, it pulls that record so that the requester is receiving every single iteration of the discussion about that particular topic.
98:58 - But if that one email string is not comprehensive
99:02 - and there are maybe multiple ones that are subsets, or even going in different directions, then those are going to have to be pulled.
99:13 - And they’re not going to be considered as dupes or near dupes, ‘cause they’re not.
99:19 - - [Alina] When you say they are going to be pulled, you mean they will be part of the response?
99:22 - - Absolutely, they will be part of the response, exactly.
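The containment behavior discussed in this exchange can be modeled in a simplified way: a shorter email string is dropped only when a longer string repeats it exactly from the start, while branches that go in a different direction are kept. The Python sketch below is an assumption-laden model, not the actual algorithm of the CDC's tool; the thread names and message identifiers are hypothetical.

```python
def most_comprehensive(threads):
    """threads maps a label -> tuple of message ids in conversation order.

    A thread is dropped only if it is a strict prefix of a longer thread,
    i.e. the longer thread contains it in full. Branches are kept.
    """
    kept = {}
    for name, chain in threads.items():
        contained = any(
            other != name
            and len(other_chain) > len(chain)
            and other_chain[:len(chain)] == chain
            for other, other_chain in threads.items()
        )
        if not contained:
            kept[name] = chain
    return kept


threads = {
    "early_reply": ("m1", "m2"),            # fully contained in later strings
    "full_thread": ("m1", "m2", "m3"),      # the most comprehensive string
    "forward_branch": ("m1", "m2", "f1"),   # broke off in another direction
}

kept = most_comprehensive(threads)
print(sorted(kept))
```

The early reply is suppressed because both later strings contain it, but the forwarded branch survives alongside the full thread, matching the point that break-off strings are part of the response.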
99:27 - - [Bruno] And I want to add onto that as well this is Bruno Viana, at the CDC from my experience using the tool and Srinath, Roger, you can back me up as far as the duplicates and the containment is concerned.
99:39 - That tool is very sensitive, so I’ve had analysts come to me and say, “These are duplicates, why is it not catching it?” But with any sort of change here or there, if there’s an attachment missing, if it’s a forward, if there’s any slight change, the tool is very sensitive and it’ll include it in the responsive documents.
00:01 - - So let’s give an example. Let’s say Roger, Bruno and Alina had an email conversation about having this webinar, right? We have emails back and forth, and there’s a final email that has this whole discussion, and then I forward it to Srinath and I just say FYI, or I don’t say anything.
00:24 - I just forward Srinath the whole email string. Now I have introduced a separate chain,
00:30 - ’cause he was not part of our conversation.
00:31 - I just forwarded the whole email string between myself, Bruno and Alina to Srinath; that is, we no longer have one composite email.
00:42 - We’ve created two separate ones now. - Yeah, thank you Roger and thank you Bruno for reminding us of this.
00:53 - And just to add one thing to what Bruno said: if somebody even adds a single line break within the email chain and forwards it to somebody else, it can create another chain altogether.
01:08 - So we’ll end up having the same content or the same scope but, as Roger has said, multiple subsets of data with the same information. - [Alina] Thanks everyone, I don’t see anything else in chat right now.
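One way to see why a single added line break is enough to defeat exact duplicate detection is to compare content fingerprints. The Python sketch below assumes, purely for illustration, that duplicates are detected by hashing the exact text; the transcript does not say how the CDC's tool compares content, and the sample text is invented.

```python
import hashlib


def fingerprint(body: str) -> str:
    """Hash the exact text; any change, however small, alters the result."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


original = "Please review the attached draft.\nThanks"
edited = "Please review the attached draft.\n\nThanks"  # one extra line break

same = fingerprint(original) == fingerprint(original)
changed = fingerprint(original) != fingerprint(edited)
print(same, changed)
```

To a reader the two messages say the same thing, but under exact matching they are different records, so both stay in the response.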
01:25 - - Thank you and I would like to open this up to anybody within the user community who is willing to provide us any recommendations that can help us.
01:39 - So we can probably have a few minutes of chat
01:41 - or discussion to see if you have any suggestions for us.
01:45 - And we can take in suggestions and have them discussed internally within the CDC office.
01:51 - - [Event Producer - Michelle] Ladies and gentlemen, if you would like to make a comment over the phone or you have a question, you may press pound two on your telephone keypad to enter the queue.
02:09 - - I think you’ve done such a great job answering questions as we’ve gone along that everyone has been silent at this point, but we’ll give everyone a couple of minutes to absorb. And Srinath, I don’t know if you want to ask Michelle to go to the next slide, where the contact information is.
02:36 - - [Srinath] Yeah sure. - [Participant] This is (indistinct) on the chat I said that the information session was very helpful for understanding you and in order to work together.
02:50 - So thank you. - [Roger] Well, thank you very much.
02:56 - - Yes, so if anybody has any additional questions, please feel free to reach out to me about any technical aspects, but if it is related to any business or administrative aspect, I recommend that you reach out to Roger or Bruno, and with that we should be done with the questions.
03:13 - - [Alina] Martha, do we have any other questions on the YouTube chat platform? - [Martha] Nope nothing from our colleagues who are watching the chat right now.
03:25 - Thank you. - [Alina] I just saw another chat question come in.
03:29 - “Does the CDC have the analyst do manual responsiveness checks to further reduce duplicate emails/attachments within threads?” - Yeah, I will pass this question to Roger and Bruno. - [Bruno] I’ll take that one, sure.
03:48 - So this is just the first part, pulling the records and deduping and doing all that.
03:53 - And every set of records that Srinath pulls is going to go to an analyst, who is going to analyze it, you know, go through and process the records before they’re released to the requester. During that process, they look to see if there are duplicates, because it’s not perfect. I mean, at the end of the day, it’s a computer; whatever you put in is what you’re going to get out.
04:11 - So you still need that human eye to look at it to make sure that, you know, everything is still responsive or, you know, we didn’t pull a bunch of out-of-scope stuff for one reason or the other. So, yes, for every package that goes out, a person will still look at it and do that analysis, and they look for duplicates.
04:28 - And again, as much as a computer isn’t perfect, neither are we.
04:33 - So there may be duplicates that we missed. But we take all the effort in the world to make sure that we catch those and not just for the requester, but it also, you know, it’s easier on us If we can catch the duplicates, it’s fewer pages that we’ve got to go line by line and review.
04:49 - So it helps us out as well so we definitely do that.
04:53 - You know, review after the search process. It definitely goes through another review before the release. - I wanted to add that in addition to the analyst who’s assigned to review it, when I’m reviewing COVID records, I’m also looking for consistency, to see duplicate emails that are contained within a comprehensive email string.
05:19 - If it isn’t flagged as a duplicate, I might just leave it in, but I have to make sure that the processing is consistent; the thing I have to watch for is that a separate email that is contained within a comprehensive email is not processed differently from the comprehensive one.
05:36 - Like I have to make sure that that’s done accurately.
05:39 - And as far as the attachment goes, that’s a little bit tricky when we talk about whether an attachment is a duplicate, right? If I have email correspondence between CDC officials and they attach a document, right, “This is CDC’s school guidance, please review and edit,” for example. And then that same school guidance edit is sent by, let’s say, Dr. Walensky, and she sends it to, let’s say, the White House and says, “This is our current draft of the school guidance.” I can’t say that just because we have released it in this email string internally, it’s the same thing and that’s a duplicate. No, it’s not.
06:34 - Because that email to the White House is a separate email, and the attachment is attached to that email string.
06:41 - Therefore, the document itself is not a duplicate and it’s included, even though it’s the exact same document that internally (indistinct) was given to by (indistinct). It’s the same document, but we’re not going to mark that as a duplicate just because it’s the same document attached to a different email; it’s not.
07:04 - So when we talk about removing an attachment as a duplicate, it means that the email and the attachment are the same. Everything should be the same; otherwise it’s in.
07:17 - If the email string is the same but the attachment is different, it’s a new record. If the email is different and the attachment is the same, as we’ve seen earlier, it doesn’t matter; it’s still a different record.
07:38 - - [Bruno] This is Bruno again, this goes back to the question that Roger answered at the beginning of the presentation.
07:42 - So in the FOIA world, the email and the associated attachments together are considered a record.
07:48 - So that’s why the default is if you make a request for emails, those attachments are going to come, unless you say that you don’t want them, you know, then we can exclude them.
07:57 - But you know, a record in this instance is that email and any associated attachments.
08:02 - So that’s why, even though the body of the email is just a forward, or it looks the same and the attachment is the same with no changes made to it, if Roger sends a draft of, you know, a document to five different people, it’s gonna go to five different people; the attachment’s the same, but the text may be different, you know, if it’s forwarded or replied to, even though there are no changes to that attachment.
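The rule stated here, that a record is the email together with its attachments and that two items are duplicates only when everything matches, can be expressed as a simple equality check. The Python sketch below is a hypothetical model: the `Record` fields, addresses, and sample text are invented, and real tools compare far more metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # A record is the email plus its attachments, compared as one unit.
    recipients: frozenset
    body: str
    attachments: tuple  # attachment contents as bytes


def is_duplicate(a: Record, b: Record) -> bool:
    """Duplicates only if recipients, body, and attachments all match."""
    return a == b


draft = b"school guidance draft"
internal = Record(frozenset({"reviewer@example.gov"}), "Please review and edit.", (draft,))
internal_copy = Record(frozenset({"reviewer@example.gov"}), "Please review and edit.", (draft,))
external = Record(frozenset({"contact@example.org"}), "This is our current draft.", (draft,))

exact_match = is_duplicate(internal, internal_copy)
same_attachment_only = is_duplicate(internal, external)
print(exact_match, same_attachment_only)
```

The identical attachment does not make the external email a duplicate, because the email it travels with is different.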
08:32 - - [Alina] Martha, I think we have another couple of questions.
08:36 - - [Martha] Yes. So this is getting to communication between the analyst and the requester: “Will the analyst reach out and say, ‘Your request is probably high-intensity. Can we talk about scope to get it to moderate or low?’ Will you do the search first before you determine that it’s high-intensity?” I guess the question is, when a request comes in, you know, is there always a search conducted, or can it be determined to be high intensity before the search is conducted? I think that is the question.
09:06 - - Yeah, this is Roger. I think, at least from my experience, some requests on their face will be a high-intensity search without you having to do a search.
09:17 - But in some situations I’ve asked my staff: before you go back and say this is overly broad or voluminous, we need to have data to support that, right?
09:30 - So we should do a preliminary search and see what we pull, because it may turn out that there’s not much discussion here. And sometimes we might do the search and realize, oh, okay, there wasn’t a lot of conversation around this subject matter.
09:43 - It seemed broad on its face, but there wasn’t much conversation here.
09:47 - So if we do the search and then we determine that it’s a high-intensity search, then Srinath would make that known to the analyst, and the analyst would go back to the requester with enough information to help them narrow the request.
10:01 - But sometimes it’s clear on its face, and I go back to this one about, “I want all correspondence that the CDC had with,” for example, the White House.
10:10 - Okay, any correspondence that CDC had with the White House from January 1, 2020 through December 31, 2020, from our experience, is going to be a high-intensity search, because there are going to be multiple people and multiple email domain names. That’s a high-intensity search right on its face, and we don’t need an extended search to tell us that.
10:31 - - [Alina] So it depends is the answer. - Yes exactly, exactly.
10:36 - - [Martha] One question that someone had regarding duplicates: “If the recipient changes, but the email thread is identical,
10:45 - the thread containing a different recipient would be retained as a non-duplicate, is that correct?” - That’s correct.
10:52 - - [Martha] The content is exactly the same, but you gotta- - It’s a different email yes.
10:57 - - [Martha] Okay, I don’t see anything else in the chat right now, unless I’ve missed something Alina.
11:04 - - [Alina] No, I don’t see anything else either I think he’s answered all the questions.
11:09 - Michelle, does anyone want to chime in orally on the phone? - [Event Producer - Michelle] No, I do not see any questions or comments on the phone.
11:19 - - [Alina] Okay, Srinath, any other wrap-up words before we say goodbye to everyone and let them get on with their day? - Yeah, sure there, Alina.
11:29 - I’d like to wrap this up by saying that, with the new tech, as long as this can be finalized and the scope is laid out, I think the biggest takeaway from this session would be that if the requester can provide us with the right scope, it makes their life and our life easier.
11:53 - And I thank you all for giving me this opportunity to present at today’s session, and I thank our other partners for giving me this opportunity to present this information. Hopefully this is a helpful session and it helps us to even cut down on our purchases. Thank you.
12:16 - - Thanks. Roger and Bruno, any other parting thoughts before we say goodbye to our folks? - [Roger] Bruno, do you wanna go first? - [Bruno] Sure, I just want to say thank you again to OGIS, and I would recommend any other FOIA offices, you know, reach out and use their services as well.
12:35 - They’re great about advertising events and organizing, running them, moderating, you know, doing all the work, so they make us look good and we do the easy part.
12:43 - So we really appreciate that. - Yeah, I also would echo that, and I would encourage any federal agency that’s listening in to take advantage of the opportunity that is given to us to communicate with your requesters about your FOIA requests. And I think the more we can communicate and the more we can let requesters know the challenges that we have to go through, or what we have to do,
13:17 - I think the better it is for all of us. And I want to say, at least on behalf of the CDC FOIA office and the agency, that we take our job of being responsive very seriously.
13:28 - And we work tirelessly, I have to say that we work tirelessly every day to make sure that we get responses to your requests.
13:37 - Are we perfect? No. Are we close to being perfect? No. But we try hard every day to get there, and part of what we’re trying to do is to hopefully get FOIA requesters to understand that they can help us meet that goal of getting responses to them as time goes by. Thank you.
13:58 - - Great message, Roger, during Public Service Recognition Week. (chuckling) I think we all work tirelessly as government employees.
14:05 - Well, thank you all very much, Srinath, Roger and Bruno.
14:09 - You’ve all done a great job of covering a lot of important material.
14:14 - I think everyone will find it very helpful. They have your contact information if they have any follow-up questions.
14:20 - I want to thank everyone for joining us today. I hope everyone and their families remain safe, healthy and resilient.
14:27 - Take care, everyone, and have a great day.
14:30 - Bye. - Thank you all, bye-bye. - [Event Producer - Michelle] That concludes the conference. Thank you for using the event services.
14:38 - You may now disconnect.