Alternatives to PDF
Jul 16, 2021 15:53 · 13243 words · 63 minute read
DAN COMDEN: –wanted to do any introductory announcement? TERRILL THOMPSON: I don’t think so.
00:08 - I’ll let you take it. I’m going to go off camera.
00:26 - DAN COMDEN: All right. Can I get a verbal yes on my screen share? TERRILL: Yes.
00:33 - DAN COMDEN: Awesome. Welcome, everybody, to today’s session, part of our webinar series through accessible technology services.
00:41 - My name is Dan Comden. I’ll be joined later by Gaby de Jongh and Terrill Thompson.
00:51 - What I’m talking about today are alternatives to PDF, and a subtitle or alternate title to this might be “Anything But PDF.” So we’ll be making the case for the problems that are inherent with PDF and then also looking at solutions.
01:11 - I learned a long time ago you can’t just show up with problems.
01:14 - You have to have solutions, as well. I want to acknowledge my sons, who often make fun of me and some of my approaches to technology even though I’ve been working with it for a long time.
01:27 - I will acknowledge that I do have a certain age component to some of the things that I’m saying, but I think in this particular instance, what we’re talking about is applicable to all age groups.
01:43 - For those who can’t see, it’s a meme on the screen right now of an old painting, somebody working on a loom and somebody younger working, a couple of kids below them.
01:55 - The person at the loom is saying, “Industrial Agers are ruining the country.” And the child is saying, “OK, loomer.” So we’ll talk a little bit about documents.
02:07 - What is a document? I think everybody has an idea of what it is, and just from the Wikipedia page a pretty solid definition of this.
02:20 - But I want to highlight the electronic matter– a piece of written, printed, or electronic matter that provides information or evidence– and it goes on from there.
02:29 - So we’re going to talk about, of course, electronic matter and the different kinds of electronic documents– pardon me– that we work with while we’re working in this online world.
02:42 - So we’ve got Word files, of course PDFs. We’ve got plain text and almost plain text, which are TXT and RTF formats.
02:53 - PowerPoint files, EPUB is an up and coming format, and then of course there’s all kinds of other proprietary file formats.
03:05 - We’re going to concentrate primarily on HTML and why HTML sort of rules this space, and we’re going to ignore audio and video documents for today.
03:20 - So I really want to concentrate on the types of content that we consume online, and primarily that is text.
03:29 - Text is still the ruler of the digital information space.
03:36 - We also have images which can be maps, technical drawings, charts, graphs, and so on.
03:44 - And then we also have STEM content, where we’re dealing with things like equations and formulas and statistical information.
03:54 - All this, of course, can be combined into the formats that we talked about.
04:00 - What I want to talk about today is sort of a hierarchy of the accessible learning tools that are available to us.
04:09 - So we’re in higher education. We’re talking about learning, primarily, or sharing information.
04:16 - And so what I want to talk about is this hierarchy with regards to promoting document functionality, which I call HALT PDF for short.
04:29 - And that, I guess, could be another alternate title of today’s presentation.
04:36 - For those of us who work in the accessibility space, PDFs are a persistent ongoing irritation– problem, challenge, what have you– to everyone that’s involved with either creating them, fixing them, or consuming them.
04:54 - So there’s a– and this is based on my experience and observations over 30 years of working with different file formats– talking about things that are inherently accessible.
05:07 - This is assuming that these file formats are properly done.
05:13 - HTML really is the best. Structured Microsoft Word files are second, and then below that we get things like RTF, which has a little bit of structure but really not much, or plain text, which has no structure.
05:29 - And then way below that we’ve got things like PowerPoint and PDF.
05:35 - And again, this is based on our observations of electronic documents– just not only in the Canvas platform, but just on web platforms in general.
05:48 - So PDF is an acronym that stands for Portable Document Format.
05:55 - I’ve seen and come up with some other ideas for what those letters can actually mean.
06:03 - It’s a pretty yet it’s a dumb file. Probably doesn’t flow is another one for the PDF.
06:11 - I think really the most accurate descriptor of what PDF is, though, it’s a print description format.
06:18 - It really is designed for documents that are going to make their way to a piece of paper.
06:26 - And a piece of paper is a very different thing than a screen or a monitor, or even an audio experience through a screen reader.
06:40 - PDFs really are made for print, and they’re good at that.
06:44 - So don’t get me wrong. I don’t think PDFs are useless.
06:48 - I think they do have a place, but I don’t think they have a place online, for the most part– for the most part.
06:58 - So we’re going to look at some numbers, and these are based on recent and historical interactions with our friends over in the Disability Resources for Students Office.
07:12 - They have quite a large team that are involved with fixing PDF documents, making them accessible for their students with disabilities that they serve.
07:26 - And I wanted to call them out for working with us on generating some of this stuff.
07:35 - I will say that over the past year, of course, everything’s different for everybody, and it’s been different for them in their office, as well.
07:44 - So some of the numbers from the last 12 months are out of sync with what we’ve observed over the last 15 years or so.
07:55 - But some of the numbers are the same as far as the number of students that they’re working with.
08:02 - So they’re working with just over 1,600 students that are receiving services.
08:09 - I’ll also point out that based on research and survey responses that we’ve seen here and at other institutions, we know that that count is low, by perhaps as much as 50%.
08:25 - So we might be dealing with well over 3,000 students with disabilities on our campus.
08:32 - But we’ll go with what’s official now, so 1,600 that are registered, and approximately 10% of those over the last year are requesting document remediation.
08:44 - And again, that number is down. But also just the number of students in general at the University of Washington, which I just found out, is also down over the past year.
08:57 - So again, these recent numbers are a bit skewed due to the COVID-19 situation, but we can go back.
09:07 - I’ve got some information from prior years, and we have no reason to believe that the trend has changed, apart from the situation of students not being on campus.
09:21 - So when I say the prior year, I mean from summer 2020 to spring 2021, which just wrapped up– over 2000 requests for document remediation through DRS, and the average number of pages is about 20.
09:40 - So we break things down into pages. As we’ll find out later from Gaby, looking at it per page is more important than looking at per file.
09:54 - So just over the past year we’re looking at over 40,000 pages, and by far most of those are PDF files.
10:04 - So keep that number in mind as we carry through on this.
10:10 - Going back to some numbers provided a couple of years ago, we see that there was pretty much a growth in requests for document remediation.
10:25 - I’m not going to read all of these tables here, but I will say they just show that there was a growth.
10:32 - One of the first things that we look at for PDFs and whether or not they’re accessible is something called text selectability.
10:41 - So can you, as a sighted person, use your mouse and highlight individual characters within the document? If you click on the page and the whole page becomes selected, then we know that that document is completely inaccessible because it’s just a picture of text, and not actually text.
11:00 - So the numbers skew a little bit for this based on our information from six, seven years ago, but we still see it’s a significant number of things that are entirely inaccessible.
11:20 - And then going past text selectability, we want to look to see whether the document has structure, so whether there’s tags or bookmarks that have been inserted into the document.
11:32 - And we see that those numbers are very, very low.
11:35 - So even for documents that are somewhat text selectable, they don’t have any structure.
11:42 - They’re just a giant gob of text– over 2⁄3.
11:47 - And there’s no reason for us to think that that has changed, again.
11:55 - PDFs are expensive and a lot of times, the cost of PDFs is really not borne by the individual or department that is producing them.
12:09 - Somebody has to fix those if a student with a disability, a print disability, needs to be able to use their assistive technology tools to listen to the text within a PDF.
12:23 - Somebody has to make that fixable. Right now, that’s viewed as an accommodation by the Disability Resources for Students office.
12:34 - They’re the ones that are doing this work. So essentially, they’re taking products that were made by other individuals and departments on campus, and fixing them.
12:47 - And I think we can maybe have a whole side discussion on whether or not that should be the responsibility of a single entity on our campus.
12:56 - I would argue that the answer to that is probably not.
13:02 - Remediation of these files can really vary quite a lot.
13:06 - If it’s a simple document with not very many pages, we’re looking at about a minute per page to remediate.
13:19 - To fix, right? To make sure that the text is selectable and detectable and also has some structure to it.
13:28 - The moment we start dealing with more complex things like tables or images or math or science, that number quickly climbs, so the number of minutes per page to remediate can really go up.
13:48 - So looking at our 40,000 pages over the last year, if we multiply that out by the other Research I schools– of which there are 130 in the United States– we’re looking at over 5 million PDF pages that are getting fixed every year nationwide.
14:13 - That’s just the R1 schools. So keep that number in mind when Gaby talks about remediation costs, over 5 million.
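The arithmetic behind these estimates can be sketched as follows. This is a back-of-the-envelope calculation using the rounded figures quoted in the talk (about 2,000 requests, about 20 pages each, 130 R1 schools). It assumes every R1 school sees roughly the same volume as this campus, which is a rough extrapolation rather than measured data, and the variable names are illustrative only.

```python
# Rounded figures quoted in the talk; all names here are illustrative.
requests_per_year = 2_000     # remediation requests through DRS in the prior year
avg_pages_per_request = 20    # approximate average length of a request
r1_schools = 130              # Research I schools in the United States, as stated

# Pages remediated at one campus in a year.
pages_per_campus = requests_per_year * avg_pages_per_request

# Assumption: every R1 school has roughly the same volume as this campus.
pages_nationwide = pages_per_campus * r1_schools

print(f"Pages per campus per year: {pages_per_campus:,}")
print(f"Pages across R1 schools:   {pages_nationwide:,}")
```

With these inputs the sketch gives 40,000 pages for one campus and 5,200,000 pages across the R1 schools, which is where the "over 5 million" figure comes from.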
14:26 - The number of public University colleges, 2-year and 4-year schools in the United States is about 1,600, a little over.
14:35 - And the total number of higher education institutions as recognized by the National Center for Education Statistics is nearly 4,000.
14:44 - So we could even go further out, but of course, enrollment is not uniform across those.
14:57 - So we’ve got these costs to fix PDFs, and one of the hidden costs is you need to have special software.
15:08 - I like this image of Adobe Acrobat Reader that I found a while ago, which is an image of a gymnast holding a book and reading. That’s your Acrobat Reader.
15:20 - That tool will not fix PDFs. If you’re going to fix them, you have to have Acrobat Pro software, and that software costs money.
15:31 - The Acrobat Reader software is free, but not the software to fix things.
15:37 - So then you’re looking at either doing it yourself, or you’re going to outsource it and outsource it on campus or outsource it off campus, that’s really the question.
15:49 - But really, the cost is time. We’ve got Grumpy Cat on the screen, “Time is something I don’t have for you.” We’re all pressed for time.
16:00 - I would argue that pushing all of that time off onto a single campus department is perhaps not a reasonable approach to dealing with the PDF problem.
16:13 - PDFs also offer a risk when it comes to the Office for Civil Rights, or the Department of Justice.
16:23 - Those agencies deal with complaints about accessibility, and inaccessible documents are a common theme that is woven into many of the complaints that OCR and DOJ receive.
16:40 - And one of the first things they do when they’re evaluating a University is they can go online and they can just look and see what is the online presence of that school.
16:53 - And so they’re going to look at everything, and that everything does include PDF files.
17:04 - So we don’t want that risk. Let’s look at it from another direction, let’s not look at everything as a threat or risk, let’s look at just making the experience better.
17:15 - How do we read our content digitally? We’re doing it on screens, right? We’re doing it in front of a computer, and that computer could be like the computer I have here in my home office.
17:29 - But we’re finding, increasingly, the computers that students are using all the time and prefer to use are their handheld computers or their mobile devices and laptops.
17:45 - And what is on the screen now are a couple of photographs.
17:49 - One is an image of the New York Times newspaper on a tiny little screen.
17:56 - And yes, many young people who have vision do have good vision and can read that small text.
18:04 - That ability doesn’t usually stay with age, but it also is a challenge to read for just about everybody.
18:15 - A lot of students also use laptops, not all those laptops have big screens either to view this.
18:22 - And we’ll talk a little bit about some of the user costs of dealing with PDF files.
18:29 - So when you have to disrupt your browsing session to open a PDF file that contains primarily text, you’ve just lost all of your navigation.
18:42 - You’ve lost your browser experience by having to open up that other piece of software, so that that navigation goes away.
18:53 - It’s not impossible for a user to bookmark information in a PDF file, but it’s very challenging.
19:00 - And it’s not a feature that’s built in to the PDF experience.
19:06 - And by design, that file format is not made to be editable.
19:13 - Even editing text with the powerful Acrobat Pro software is not an easy experience, and Adobe’s just not done it.
19:25 - So the PDF format, which has been around since– I think it really started appearing in the early ‘90s.
19:33 - Nearly 30 years, Adobe still hasn’t provided that functionality and just in general, the company that was initially responsible for the PDF format.
19:45 - I will point out, a lot of people think it’s proprietary, it no longer is.
19:49 - It’s an open format as of 2008. But the primary tools that are used to deal with PDF files still come from Adobe.
20:02 - As an aside, the Microsoft Word for Windows format is only about five years older, but it has quite a bit more capability.
20:11 - Just inherently, as far as accessibility. So again, going back to the slide, the user costs.
20:19 - No bookmarks, no navigation– unless the file creator has inserted the navigation in there in terms of what are PDF bookmarks, which are not the same thing as user bookmarks.
20:35 - So getting text out of that may or may not work, it really depends on how that PDF was created.
20:42 - Again, if it is an image of text, it’s very difficult to get text out of it on the user side of things.
20:51 - In my mind, one of the biggest, most serious problems with PDF files is that the text doesn’t reflow.
21:01 - So if you’re looking at it on a smaller screen, you end up having to do a lot of horizontal scrolling which is very, very difficult to do and retain your place in the document.
21:13 - I think, if nothing else, that lack of a reflow really is the stake through the heart of the PDF file format.
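A small sketch can illustrate what reflow means here. It is only an analogy, not how PDF or HTML rendering actually works: line breaks fixed once for a page width stand in for a print layout, and re-wrapping to the current screen width stands in for what a browser does with HTML text. The function names, the sample text, and the widths (measured in characters) are all hypothetical.

```python
import textwrap

paragraph = (
    "PDFs really are made for print, and they are good at that. "
    "A piece of paper is a very different thing than a screen or a monitor."
)

def reflow(text, screen_width):
    """Re-wrap text to the screen width, as a browser does with HTML text."""
    return textwrap.wrap(text, width=screen_width)

def fixed_layout(text, page_width=60):
    """Break lines once for a fixed page width, as a print layout does."""
    return textwrap.wrap(text, width=page_width)

def needs_horizontal_scroll(lines, screen_width):
    """True if any line is wider than the screen."""
    return any(len(line) > screen_width for line in lines)

phone_width = 30  # hypothetical narrow screen, in characters

# Reflowed text always fits the screen it was wrapped for.
print(needs_horizontal_scroll(reflow(paragraph, phone_width), phone_width))
# Lines laid out for a 60-character page overflow a 30-character screen.
print(needs_horizontal_scroll(fixed_layout(paragraph), phone_width))
```

The first check reports that no scrolling is needed, while the second reports that it is, which is the horizontal-scrolling problem described above.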
21:25 - That lack of reflow, and also, a lot of people don’t understand that students with disabilities– people say, all you need is this extra reading software.
21:34 - Well, students with disabilities often are already using extra software, and we’re just layering them up either with browser plug-ins or additional applications.
21:49 - They already need that. In the case of students with visual impairments, they’re using magnification or they’re using screen reading.
21:58 - Many students with other print disabilities that do have vision are using TTS or text-to-speech software.
22:06 - So that’s a lot to ask, to pile on the cost of the PDF file.
22:14 - So all that said, students often still are asking– when they’re asked, what file format do you want this fixed file in? They’re asking for PDF files.
22:24 - And they don’t know about other ways to get information, and so they’re still asking for PDFs, and so it is a question of training for these students.
22:35 - And a lot of them will say, “I know what I like.” I would counter that with, “They like what they know.” And all of us have a tendency to do this.
22:44 - It’s not just a disability-related thing, it can be hard or even feel a little painful to make a change, but we need to be thinking about what the best experience would be.
22:59 - So we’ve got a rule for ARIA. For those who have been to some of our other webinars, ARIA is a tool in the HTML world for creating accessible internet applications.
23:10 - And the first rule of ARIA is, don’t use ARIA when you don’t have to, if there already exists HTML things for it that will do the job.
23:21 - And I would like to extend that and have that also be the first rule of PDF, don’t use PDF when HTML will work.
23:29 - Jakob Nielsen is a well-known researcher in the usability space.
23:36 - In an article going back, I think, to 1996,
23:40 - he talks about his first law of computer documentation, which is, users don’t read computer documentation.
23:47 - And then the corollary to that is if they do read it, they’re not reading it front to back, they’re looking for specific things.
23:55 - And doing that in a PDF file can be very, very difficult.
24:00 - Nobody really reads a manual like you would a book.
24:07 - So we do have examples of places where we can do user documentation on.
24:14 - And I’m just going to point out. Let’s see here.
24:21 - Stand by for just a moment, I want to make sure I’ve got this.
24:32 - Got it up, there we go. We’re back two.
24:44 - So what’s on the screen now is user documentation for Workday.
24:51 - And we were able to convince early on before Workday was deployed on campus that– they initially wanted to do the user documentation in PDF files, and we convinced them that that was not the way to go.
25:05 - And so what we’ve been able to do is put this documentation in a place that’s easy to find, easy to use, the text flows well, and it works well for everybody.
25:20 - So it’s entirely possible to do this, and to do it successfully.
25:28 - And that’s in the user guides in the Integrated Service Center.
25:33 - So for further reading for those who are going to pick up the PowerPoint after the presentation, I’ve got a couple of links here for further reading or further discussion.
25:49 - This first one was written just last year by the Nielsen Norman Group, so it’s not stale.
25:57 - The one below this, where it talks about PDFs being a strange otherworldly out of browser experience is a bit older, but I think it’s still relevant.
26:08 - And then also, another recent document that talks about things more from a commercial search engine optimization viewpoint is this one titled “Why Are PDFs (mostly) Awful and What’s the Alternative?” So how do we publish better documents? Well, part of it I think is getting some better training out there, which is part of what we’re doing today.
26:33 - Making tools to fix the existing PDFs better.
26:41 - We’ve got some good ones. Microsoft’s ability to export a good PDF is well-known as part of Microsoft Word.
26:51 - So we need to make sure that people are using this.
26:53 - Do we want to institute policies to discourage use of PDF? I don’t have answers for that, but I think it’s something for all of us to think about.
27:04 - But really, we want to talk more with our members of faculty, because they’re the ones that are putting a lot of the educational materials online.
27:14 - And anybody who is supporting staff in the campus environment would do well, and Terrill is going to go into some detail on creating good campus based content a little bit later on.
27:29 - So all this said, you get the impression I don’t care for PDFs.
27:32 - And you would be correct, but there is a place for them.
27:36 - And I didn’t want to sell them completely short, but if there is a document that should be printed, it’s a great format for that.
27:47 - It really is great for something that’s going to be printed.
27:50 - So things like posters and brochures, or anything that requires an actual physical signature, PDF file.
28:00 - So it’s going to get printed out, it’s going to get signed, maybe returned in person or by mail.
28:07 - PDF is an appropriate format for that. And of course, there are some “official”– and I use that word in quotes– or legal documents that are produced in PDF, because there are expectations that these official documents look a certain way.
28:25 - And I’d like us all to push back on that, because I have been told that some of these forms are official when there’s really no official reason for them to be official.
28:39 - It’s just what people know and what they’ve been using.
28:44 - So I want to encourage folks to rethink what a document is, and how we consume information now versus maybe 20 years ago.
28:55 - When we’re coming up with a report or a brochure or any kind of information, we want to design it from the start to be viewed on a screen.
29:06 - And we really want to get beyond this paper think idea that just seems ingrained in so many people.
29:16 - And young people as well, it’s not an old versus young issue.
29:21 - But a lot of that is how they’re taught is, what is this thing going to look like when it gets printed? Well, a lot of this stuff is never going to get printed.
29:29 - It’s only going to be consumed on a screen.
29:31 - And so we want to get rid of this idea of a page metaphor for our documents, because the size of the page is not relevant.
29:40 - If you’re viewing it on a handheld device versus the situation I have here, I’ve got nearly three feet wide of screens, those are two very different things.
29:51 - The reading experience is very different. So what is a page? The visual style of our information is not more important than the content in that text.
30:04 - And I think it’s really important for us to keep that in mind.
30:09 - So I’m going to stop sharing now. We’re going to hear from Gaby who’s going to talk about how to get information out of our PDFs, as well as a little more information on what it costs to do remediation outside of the DRS office.
30:29 - Go for it, Gaby. GABY DE JONGH: All right.
30:31 - Thanks, Dan. Let me take a moment here, share my screen.
30:42 - Everybody could see my screen? DAN COMDEN: Yes.
30:45 - GABY DE JONGH: Excellent. So thanks, Dan.
30:49 - My name is Gaby de Jongh. I’m also a member of the IT Accessibility team, and when we were prepping for this we were trying to figure out different solutions for what could we use instead of PDF.
31:06 - And so we thought about, well, we can convert it to different formats.
31:12 - And so I’m going to talk about converting PDF to Word.
31:16 - Terrill’s going to talk about converting PDF to HTML.
31:20 - But one of the biggest misconceptions about PDF is that it can’t be edited or changed, but it’s actually quite easy to take information and edit or change a PDF document just by exporting it into a different format.
31:37 - So that kind of blows that theory out of the water.
31:43 - But I wanted to kind of give you a little bit of information as to the methodology for turning a PDF document into Word.
31:54 - And then, I’ll give you a little bit more information about the findings when I’ve accomplished that.
32:03 - So essentially, what I did is I just did a Google search for the term convert PDF Word for free, and took the top four search results and had a PDF and just went through these different top search results and just see what the output is.
32:26 - So I would run them through the PDF to Word converter and then I’d open them back up again in Word and run the accessibility checker to see if there were any errors.
32:40 - And then I also performed a manual review of the styles to see what we could compare.
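For anyone who wants to repeat the style review across many converted files, the step could be roughly automated. The sketch below is not the method used in this session, which relied on Word's own accessibility checker and styles view. It assumes that a .docx file is a zip archive containing `word/document.xml`, and that built-in styles appear with IDs such as `Title` and `Heading1`, which is typical for English-language Word output but not guaranteed. The function names and the sample builder are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# WordprocessingML namespace used inside word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def paragraph_style_counts(docx_file):
    """Count paragraphs per style ID in a .docx (a zip archive of XML parts)."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    counts = Counter()
    for paragraph in root.iter(f"{{{W}}}p"):
        style = paragraph.find(f"{{{W}}}pPr/{{{W}}}pStyle")
        # Paragraphs with no explicit style fall back to the default style.
        style_id = style.get(f"{{{W}}}val") if style is not None else "Normal"
        counts[style_id] += 1
    return counts

def make_sample_docx(style_ids):
    """Build a tiny in-memory archive for demonstration only (not a full Word file)."""
    paragraphs = "".join(
        f'<w:p><w:pPr><w:pStyle w:val="{s}"/></w:pPr><w:r><w:t>x</w:t></w:r></w:p>'
        if s else "<w:p><w:r><w:t>x</w:t></w:r></w:p>"
        for s in style_ids
    )
    xml = f'<w:document xmlns:w="{W}"><w:body>{paragraphs}</w:body></w:document>'
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer

# One title, two headings, and one unstyled paragraph.
sample = make_sample_docx(["Title", "Heading1", "Heading1", None])
print(paragraph_style_counts(sample))
```

A converted file in which every paragraph lands in the default style would show up as a single "Normal" entry, meaning the output has no heading structure at all.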
32:49 - So the original document that I used for this conversion process is a completely tagged PDF, the image has alt text, the title is in H1 and the other supporting headings there are tagged as H2, lists are tagged as lists, the table has a column header and the document is identified as English.
33:17 - And then, we’ve got one sentence there at the end that is identified as French.
33:25 - So the first tool that came up during the Google search is the Adobe Convert PDF to Word.
33:33 - And all of these solutions are all web-based, I didn’t want to download anything on my computer, I didn’t want to give anybody my credit card information for a 7-day free trial or anything like that.
33:45 - I just wanted to quickly find a solution for taking a PDF document and converting it to a Word document.
33:53 - And so Adobe Convert PDF to Word was the first item that came up, and it’s completely web-based.
34:01 - And for all of these, it’s just a matter of taking your file and dropping it into the web browser.
34:09 - But for Adobe, it did actually require that I create an Adobe account in order to use the service.
34:17 - It’s free, it’s still free, you just have to sign up with your email or something like that so that Adobe can send you annoying messages about buying the products.
34:29 - So I did that and then I opened it up in Microsoft Word and I ran the accessibility checker, and in the inspection results, I got a warning.
34:42 - And this is true for all of the output, to check the reading order of the tables.
34:48 - So even though the tables were created in PDF with a column header, that did not convert back in Word.
35:00 - So the table header did not stick when it was converted back to Word.
35:07 - The H1 was actually marked as a title, and I’ve actually included a screenshot in here of the styles guide, which gives you kind of a visual representation of the structure of this document, you can kind of see there.
35:27 - The heading level one was actually marked as a title.
35:30 - And then the heading level two were marked as heading level one, which can be kind of confusing.
35:38 - Lists were still marked as lists, the document title was still intact, and the French language section there was also marked as English.
35:50 - The second method or the second option that came up was Simply PDF.
35:56 - And this was completely free, I didn’t have to put in my email, or anything like that.
36:01 - So I could just drag it into the browser and then it performed its conversion process and then I downloaded it.
36:07 - Ran the accessibility check, and again, it gave me the inspection results of a warning.
36:12 - That I needed to check the reading order of the tables and make sure that the column headers were marked.
36:19 - Again, same thing for the H1, was marked as a title.
36:25 - H2 was also marked as heading one, so kind of a similar output to what Adobe had as well.
36:36 - The third search result was Free PDF Convert, and this is another one that’s free that does not require any sign up or email or anything.
36:47 - And when I ran the accessibility checker, the inspection results again gave me the warning for the table.
36:54 - And this time, this is really interesting, that the H1– instead of being marked as the title, was marked as normal text.
37:03 - But the H2s were still marked as heading one, lists were marked as lists and the document title was intact.
37:09 - But everything was marked as English language.
37:15 - And then the fourth method was a product called pdf2docx.
37:22 - Again, this is another free one, and this produced probably the worst results out of all of them.
37:29 - The accessibility checker came up with the warning for the tables, but then, it also came up with a warning for missing alt text for the image and the image was not in line with the rest of the text either.
37:44 - And as you can see from this particular screenshot, all of the contents for this output was marked as normal text.
37:52 - So essentially, we have no structure for this particular output.
37:57 - So that would be the least desirable of all of the outputs.
38:02 - So conclusions for this would be, if you wanted to convert your PDF document to a Word document– and there are many reasons to do so, one of them being that a PDF is really supported in the Windows environment, not so much in the Mac environment.
38:22 - So the conclusions would be to use the PDF to Word converter from Adobe Acrobat, but if you didn’t want to sign up and get constant annoyances from Adobe, you could use Simply PDF as that does maintain most of the structure.
38:42 - But then of course, you’ll need to touch up your tables, as those table headers are not converting.
38:52 - I wanted to share a little bit more information about the cost of remediation.
38:58 - I’m actually in the middle of a pretty big project right now, we are working on making Canvas model courses that are available to UW to review and see what a model Canvas course looks like.
39:14 - We’re in the process of taking these Canvas model courses and making the content accessible.
39:20 - And I wanted to share with you one of the courses so you could get a better idea of all of the work and the cost that is associated with, retroactively, making a Canvas course accessible.
39:38 - So for this particular Canvas course, it has an accessibility score of 57%.
39:45 - And you can see that there are 523 elements associated with this course, about 184 PDF documents, 57 Word documents and some other items there.
39:59 - And you could see that out of all of these documents, there’s about 189 that have a very low score, and I believe most of those are PDF that do require remediation.
40:16 - So I want to break this down a little bit more for you.
40:21 - So for this particular course, there are about 54 Word documents, which equal 311 pages.
40:30 - For PowerPoint, there were 87 decks, which included 1,918 slides.
40:37 - For Excel, there were 36 workbooks, with 61 worksheets.
40:43 - PDF, there were 182 PDF documents, for a total of 2,424 pages.
40:52 - Now, we have a contract with a PDF remediation vendor called Open Access Technologies.
41:00 - And I sent all these documents to the service for a quote, and we actually have a standard quote of $8 per page for remediation.
41:13 - But because there are so many documents we got an even bigger discount, a volume discount for $6 per page for remediating all of these documents.
41:28 - And for some of the PDFs, they were quizzes.
41:33 - And they had some form elements, or they should have had some form elements in order to be utilized accurately.
41:42 - And so there was an additional hourly cost for adding tool tips and form fields to some of these documents, and that was 62 and 3⁄4 hours at $25 an hour to add that additional information.
42:00 - So the total cost just for remediation of all of these documents is $29,852.
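The total can be reproduced from the figures quoted above. This is a rough sketch, not the vendor's actual quote: it assumes each slide and each worksheet is billed as one page at the discounted rate, which is an assumption, though it happens to land on the quoted total. All variable names are illustrative.

```python
# Figures quoted in the talk. Treating each slide and each worksheet as one
# billable "page" is an assumption, not something stated by the vendor.
pages = {
    "documents (assumed to be Word)": 311,
    "PowerPoint slides": 1_918,
    "Excel worksheets": 61,
    "PDF pages": 2_424,
}
rate_per_page = 6.00   # volume-discounted rate, dollars per page
form_hours = 62.75     # extra hours for tool tips and form fields
hourly_rate = 25.00    # dollars per hour for that extra work

total_pages = sum(pages.values())
page_cost = total_pages * rate_per_page
form_cost = form_hours * hourly_rate
total_cost = page_cost + form_cost

print(f"Total pages: {total_pages:,}")
print(f"Per-page remediation: ${page_cost:,.2f}")
print(f"Form work: ${form_cost:,.2f}")
print(f"Total: ${total_cost:,.2f}")
```

Under these assumptions the sketch gives 4,714 pages, $28,284.00 in per-page charges, $1,568.75 for the form work, and $29,852.75 in total, consistent with the $29,852 quoted.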
42:10 - And it took me about 8 and 1⁄2 hours to go through this Canvas course and audit all of the files to pull them down off of the Canvas course, collect them and then put them in a shared folder so that I can share them with our remediation service.
42:32 - And just, a lot of my administrative time, just to collect these documents.
42:39 - And I’m not done yet, I haven’t received the completed files from the remediation service, so I still have to replace them in the Canvas course.
42:49 - So there is additional time that needs to be included there.
42:54 - So it does really kind of add up in terms of monetary cost and time for retroactively making a Canvas course accessible and remediating PDF documents.
43:10 - Now had this instructor from the beginning thought about accessibility ahead of time and created accessible content while they’re putting this course together, it may have taken a little bit longer, admittedly to get this course together.
43:26 - But it would have probably saved the University a lot of money and saved the University a lot of time, had they put the effort up front into making the content accessible.
43:38 - So that’s pretty much all I wanted to share with you, and I’m going to go ahead and turn it over to Terrill.
43:51 - TERRILL THOMPSON: Thanks, Gaby. And just to clarify, that was one course of several.
43:56 - I’d forgotten how many were in the set, somewhere around eight to 10? GABY DE JONGH: Yeah, I think there’s actually nine courses.
44:02 - TERRILL THOMPSON: Nine courses. And you’re seeing similar numbers in the other courses, too.
44:07 - So this is not just an isolated incident, it’s a pretty common trend in our online courses.
44:16 - So we’ve got just a little over 10 minutes left, close to 15 minutes left.
44:24 - I just pasted in to chat the URL of the archived webinar recordings, so we’ll share our slides there as well.
44:35 - And you also have this recording, but I think I am going to go over, but I got some good stuff.
44:43 - So I hope that you’ll stick with me, because it’s going to be fun all the way down to the final slide, I think.
44:51 - But if you do have a hard stop at 4 o’clock, then this is being recorded, so you can catch up later.
44:59 - Let me share my screen. So as Gaby mentioned, my goal is to get to HTML, because HTML really is the ultimate format.
45:18 - From the beginning, it has had really good markup for structure.
45:23 - Headings have been there since the beginning, alt text for images has been there since the beginning.
45:27 - So we’re talking early ‘90s, HTML has been accessible.
45:34 - And in HTML 4.0, which was many, many years ago– decades ago, they introduced a bunch of new elements that definitely set the bar for accessibility.
45:50 - So this is where we got labels and legends and fieldsets for forms and where we get table headers and the scope attributes and all the things that make tables accessible and much, much more in HTML.
46:07 - And then HTML5 is taking it even a step further with new semantic elements that are supported by screen readers.
46:14 - So ultimately, HTML is the best option. It works across operating systems, it reflows nicely.
46:22 - So all the criticisms that Dan had in the first portion of this presentation, HTML works, it addresses all of those things.
46:34 - And it is cross-platform, whereas PDF, even though we’re spending so much money and so much time to make PDFs accessible, really is a Windows-only solution.
46:44 - There is starting to be some support on mobile devices, both iOS and Android for a tagged PDF, but it still is pretty limited compared to what you can get with HTML.
46:56 - So I wanted to explore how do you get from PDF, because you’ve got tens, hundreds of thousands, even millions of PDFs out there.
47:05 - And a lot of the PDFs we’re using in courses, in particular, are coming from third parties.
47:10 - And so we get a PDF from somebody, because it’s a good resource and we want to use it in our course, and that’s the only format it’s available in.
47:19 - So how do you get that and convert it into HTML? Is there a way to do that effectively? So that’s what I’ve been doing some research on, trying to find a good strategy for doing this, and I approached it from two perspectives.
47:35 - First, what is the best way to convert from PDF to HTML? So, similar to what Gaby was doing with Word, but I wanted to get to HTML.
47:44 - And second, what is the best way then to get that converted HTML onto the web? And the three environments that most of our web content is delivered in at the UW are the ones that I focused on.
47:58 - On Canvas, a Canvas page, a web page using WordPress or a web page using Drupal.
48:05 - So I started with the same original source document that Gaby started with, and I actually used two different versions.
48:12 - You’ve probably seen this if you’ve attended some of our other trainings, we use this document pretty regularly.
48:19 - And in its PDF form, if it’s tagged, then you’ve got the tag structure that Gaby was describing.
48:28 - We’ve got an image that has alt text, we’ve got heading one and heading two– we have two levels of headings.
48:34 - We’ve got a list, we’ve got tables that have explicit column headers identified in the PDF tag tree, and you’ve got a document that’s identified as English and one sentence as French.
48:47 - And so I looked at different methods for converting that to HTML to see which of these methods preserve the tag structure.
48:59 - First of all, I used Acrobat Pro DC. This is Desktop 2020, and both Windows and Mac will give you the exact same results.
49:10 - And the option is Export to HTML Web Page. And you get an image in a separate file, so it creates a folder and puts all the assets in there and then links to it.
49:24 - And that’s OK, it’s harder to distribute that way, but that works.
49:30 - And headings are preserved, but one heading in this example is in the wrong place.
49:37 - So it actually put “Textbook” above “Introduction to Physics Course Syllabus,” so it repositioned those.
49:44 - The list was coded as an unordered list, which is appropriate.
49:48 - However, the bullets didn’t work for me. And I tried this in both operating systems and it gave me the same results, so I don’t know why the bullets did not appear, but that was a visual flaw.
50:03 - Column headers, even though they’re tagged as table headers, as TH, in the PDF, are not exported properly; those are exported as TD.
50:13 - And the document is tagged as English, the French content is not tagged.
50:17 - The visual appearance is approximated, other than it got those headings in the wrong place but otherwise, it more or less preserves the visual appearance using inline CSS.
50:30 - So there’s a lot of additional CSS in the markup in order to preserve the look.
50:38 - So, OK, but not great. The second method was to Upload to Canvas.
50:47 - So there was actually a question in chat while you were talking, Gaby, about how you gathered so much data.
50:53 - And I know you did a lot of stuff manually, but there’s also the accessibility report in all Canvas courses and that’s available in the instructor menu.
51:03 - And that is made possible by a tool called Blackboard Ally.
51:10 - Ally is the name of the product. Blackboard is now the owner of that product, but it works in multiple learning management systems, including Canvas.
51:18 - And it does a few things– it checks the accessibility of materials that are uploaded into the course and provides instructors with feedback, but it also allows users to generate custom versions or alternative versions of everything that gets uploaded.
51:37 - And so if you upload a PDF, then a user or students can download an HTML version or versions in various other formats.
51:50 - And when you do that, the image is recreated as part of the HTML document.
51:58 - And so it’s not a separate file, which makes it a lot easier to distribute.
52:02 - It’s using the base64 image source attribute, and so it encodes the image.
52:10 - And that then is in the source code, so it’s part of the document itself, not a separate file.
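To make the embedding technique concrete, here is a minimal sketch of how an image can be carried inside the HTML source as a base64 data URI. It is not Ally's actual implementation, which the talk does not describe; the function names and the sample bytes are hypothetical.

```python
import base64
from html import escape


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a data URI suitable for an img src attribute."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def img_tag(image_bytes: bytes, alt_text: str, mime_type: str = "image/png") -> str:
    """Build an <img> element whose picture data lives in the HTML itself."""
    src = to_data_uri(image_bytes, mime_type)
    return f'<img src="{src}" alt="{escape(alt_text, quote=True)}">'


# Placeholder bytes: only the PNG file signature, not a complete image.
sample = b"\x89PNG\r\n\x1a\n"
tag = img_tag(sample, "Physics course logo")
print(tag)
```

The trade-off is the one described in the talk: the document travels as a single file, but the encoded image makes the source considerably larger, and some editors strip it on save.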
52:18 - The headings are preserved, so it got the heading ones right, heading twos are all right.
52:23 - The list is coded as an unordered list. It doesn’t try to stylize the bullets, so you don’t end up with those funky broken font icons, it just lets the browser render the list as it will by default.
52:39 - Column headers are correctly tagged as THs.
52:42 - And I think they even had scope="col", so they’ve got the scope attribute on there.
52:49 - Document’s tagged as English, the French content is not tagged, and so that actually was a problem.
52:54 - I found no solution, it sounds like Gaby found no solution, too.
52:58 - For the language issue, that’s not communicated well across platforms, so it’s kind of an isolated issue.
53:05 - If you have a multilingual document, then that would need to be addressed after you convert, but most documents probably are not going to need that fix unless you’re in a foreign language discipline.
53:20 - And again the HTML output actually, differently from what Acrobat created, it does include a CSS block that provides some styling but it doesn’t rely so extensively on CSS.
53:38 - And the thinking with HTML is, particularly if you’re going to plug it into another platform– you’re going to plug it into your Canvas course, or if you’re going to plug it in to WordPress, or to Drupal– you’ve already got a theme for that context, that website.
53:55 - And ideally, it won’t have a bunch of inline styles, it will just accept the theme and the document will plug in.
54:03 - It won’t look like it originally looked, but it’ll look like all the other pages within that website or within that course.
54:11 - And so that really is ideal, I think, to not have that kind of extra styling.
54:17 - But Ally does add a little bit of CSS so that you have some of the same styling that you had in the original.
54:29 - I also looked at another tool that we provide, this is available as a URL in the top corner, tinyurl.com/uw/doc/convert.
54:40 - That’s for our UW document conversion service, this is powered by SensusAccess, and so a third-party tool that we license.
54:50 - And the nice thing about this, the reason that we have it in place is because it takes DRS– Disability Resources for Students– some time to generate alternative formats on behalf of students.
55:02 - And so this is a service that students can use or anybody with a UW NetID can use to upload a document, get it back in a wide variety of formats through email, and just converting it to an alternative format so that you can access it more easily.
55:21 - And we have a number of students who use this on a regular basis, but for HTML, it doesn’t produce nearly the level of output that Ally does within Canvas.
55:35 - So the output is readable, it actually will do OCR.
55:39 - If it’s a scanned PDF, it’s just a picture of text, no actual text.
55:44 - It will convert that to text, so the document then is scanned and converted.
55:50 - It may have some errors, depending on how bad the original is, but there is text there.
55:56 - However, there’s no semantic structure. It doesn’t even make any effort to read the PDF tag tree that’s under the hood.
56:05 - It just tags everything– everything is essentially a paragraph.
56:08 - [AUDIO OUT], the table is tagged as a table but the headers are TDs, not THs.
56:18 - Everything’s a paragraph, all the headings and everything.
56:20 - And it bolds headings, if they originally were bold, but doesn’t add any structure to them.
56:27 - So, not a great tool for converting to HTML.
56:33 - Method 4a, there are lots and lots of PDF-to-HTML conversion tools.
56:39 - So if you do something similar to what Gaby did and just do a Google search for PDF-to-HTML conversion tools, you’ll get dozens, maybe even hundreds of results.
56:50 - And I didn’t want to try them all, I wasn’t quite brave enough to do that, because I was afraid of what sort of malicious things I might be downloading to my computer.
56:59 - But I looked for top 10 conversion tool lists from credible sources, and compared them, and found a few tools that were referenced in multiple places and felt that those were worth the risk to try.
57:16 - And what I found– I’m not naming any names here, because I found that essentially they’re all the same– they produce HTML pages that are almost exact, if not exact, replicas visually of the original.
57:33 - Really focused on preserving visual appearance, but they don’t have any semantics.
57:40 - I’ve got a screenshot here of some source code from one of these documents, and the whole thing is divs nested within divs nested within divs.
57:51 - Very deep levels of divs, everything in the entire HTML document is a div that has an enormous amount of classes and inline styles added to it in order to make it look the way it looks.
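One rough way to tell this kind of output apart from structured output is to count elements. The sketch below uses Python's standard html.parser; the set of tags treated as "semantic" and the sample strings are assumptions chosen for illustration, not a standard measure.

```python
from collections import Counter
from html.parser import HTMLParser

# Tags counted as carrying structure. This list is an illustrative choice.
SEMANTIC_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6",
                 "p", "ul", "ol", "li", "table", "th"}


class TagCounter(HTMLParser):
    """Counts how many times each element is opened."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def semantic_summary(html_text: str) -> dict:
    """Return total elements, div elements, and semantic elements."""
    counter = TagCounter()
    counter.feed(html_text)
    counter.close()
    semantic = sum(n for tag, n in counter.counts.items() if tag in SEMANTIC_TAGS)
    return {
        "total": sum(counter.counts.values()),
        "div": counter.counts["div"],
        "semantic": semantic,
    }


div_soup = '<div class="p1"><div style="font-size:24px"><div>Course Syllabus</div></div></div>'
structured = "<h1>Course Syllabus</h1><ul><li>Textbook</li></ul>"
print(semantic_summary(div_soup))    # {'total': 3, 'div': 3, 'semantic': 0}
print(semantic_summary(structured))  # {'total': 3, 'div': 0, 'semantic': 3}
```

A file where nearly every element is a div and the semantic count is zero is a strong hint that the converter preserved appearance only.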
58:09 - So my conclusion from that is the only way to get from PDF to HTML with tag structure in place is if you’re using a tool that is specifically designed for that, so one that is focused on accessibility.
58:24 - Like Ally, it will get you where you need to get and exporting from Acrobat gets you there as well, but not one of these other tools that is not designed for accessibility.
58:38 - So that’s if you start with a tagged file. So recognizing that most of the PDFs out there are not tagged, they were not designed with accessibility in mind, how do we get to tagged PDF or tagged HTML? So we have the same document available in an untagged format, so it is just text.
59:02 - The image has no alt text, those headings are not really headings, they’re just big bold text.
59:08 - There’s no underlying tag structure at all, so accessibility is not possible in the PDF itself.
59:19 - So now, we’re relying on the conversion tools to assign tags intelligently.
59:26 - And with Acrobat Pro, if we export to HTML, then it does create an HTML with tag structure.
59:35 - And actually, it did a surprisingly good job, this is an area where this has improved over the years.
59:41 - The image, it doesn’t make any effort to intelligently assign alt text to that.
59:46 - As you probably have seen, the science there is getting better, where Microsoft Word– for instance– if you insert an image into a Word doc, it will add an alt text using artificial intelligence.
59:59 - Not always great, and usually, it needs to be edited but at least they’re attempting to do something.
60:06 - In this case, it just says alt="image", every image gets that alt text.
60:11 - The H1 and H2 in this sample document were correctly tagged, and so that– your mileage may vary, depending on the complexity of your document, and whether– I don’t know what the algorithm is, but presumably, they’re looking at text size and position relative to other text and things like that.
60:31 - And in this case, it was able to intelligently identify the headings properly.
60:38 - The list was correctly tagged as an unordered list, the column headers, it missed that one.
60:45 - It assigned them as TDs, not THs. And again, no language attributes.
60:50 - And again, similar to the original export from Acrobat, we had a visual appearance that was approximated using inline CSS.
60:59 - Again, identical results in both Mac and Windows.
61:05 - And this is where Blackboard Ally really shines, I think, that if you feed it an inaccessible PDF it is able to intelligently convert that.
61:18 - It did not do anything with the image, and I need to play with this some more.
61:23 - I’m not quite sure why that is, but in the original, when it was tagged it was able to take that image and encode it into the HTML document.
61:33 - In this case, there was no image in the untagged document.
61:37 - So apparently, it’s using the tag tree in order to understand something about the image, which kind of surprised me.
61:45 - But I don’t know all of the inner workings of the underlying structure of a PDF, and so apparently, that’s a challenge getting an image out of an untagged PDF.
61:56 - The tags that are the headings H1, H2, correctly tagged.
62:03 - Again, that textbook– for some reason– that heading is in the wrong place, so that happens in multiple tools.
62:10 - The list is correctly tagged, column headers are correctly tagged, language– again, it misses that.
62:17 - And again, it’s like the original Ally output.
62:21 - There was some CSS, but not as much CSS as if you go from Acrobat.
62:28 - But particularly, what kind of sets this apart is the column headers in the table.
62:34 - That was really a difference, that Ally’s able to do that, and Acrobat was not.
62:41 - And if the original document is tagged, Ally embeds the image in the HTML file, which also kind of sets it apart in terms of the conversion tools.
62:54 - SensusAccess, it wasn’t able to do much with the tagged PDF, and so I didn’t expect it to do anything better with an untagged PDF.
63:03 - And the results actually are identical, so its method works consistently no matter what you feed it, and it’s not great and there’s no structure.
63:15 - So given all that, the answer to the first part of my question, how do you get from PDFs to HTML? The best way from what we’ve just walked through is to use Ally.
63:31 - Ideally, you start with a tagged PDF but if that’s not available, Ally still does a pretty good job of intelligently adding structure to the HTML document.
63:42 - So that process then, upload the PDF to the Canvas course, go to the alternate formats menu and download the HTML version.
63:54 - If you don’t have access to Canvas, then using Acrobat Pro DC is another option, export to HTML from there.
64:04 - But again, a couple things at least in the testing that I did where Ally does a better job, but both of these could be viable.
64:12 - So then the second part of my question here in my research is, what is the best way then to take that converted HTML and plug it into your online course? And for these tests, I used the Ally output, since I was sold on what it was able to do.
64:34 - So a good tagged HTML, it’s got all the right stuff, or most of the right stuff.
64:43 - One method– so we focus on Canvas, first of all.
64:47 - If we copy the HTML document– open up the Ally-Converted document in your browser, copy it, then paste it into the Canvas Rich Content Editor where you go in and you create a new Canvas page.
65:03 - Just paste that content directly into the Rich Content Editor, not in the HTML editor, but in the visual Rich Content Editor.
65:11 - Then what you end up with is, all the HTML elements and attributes are preserved.
65:17 - The H1 element is preserved, even though H1 in our Canvas environment is not available as an option.
65:24 - H1 is the title of the page, which you assign in another field.
65:30 - So ideally, there shouldn’t be an H1. The first level of headings would be H2, but if H1 is important in the original and you want to preserve that, then it does convert exactly as you copied it.
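If you would rather have the pasted content start at H2, the heading levels can be shifted down before pasting. This is a small sketch using a regular expression, which is adequate for simple converter output but is not a full HTML parser; the function name is hypothetical.

```python
import re

# Matches the opening or closing part of h1 through h6, e.g. "<h1" or "</h2".
HEADING_TAG = re.compile(r"<(/?)h([1-6])\b", re.IGNORECASE)


def demote_headings(html_text: str) -> str:
    """Shift every heading down one level (h1 -> h2, ...), capping at h6."""

    def shift(match: re.Match) -> str:
        level = min(int(match.group(2)) + 1, 6)
        return f"<{match.group(1)}h{level}"

    return HEADING_TAG.sub(shift, html_text)


print(demote_headings("<h1>Course Syllabus</h1><h2 class='x'>Textbook</h2>"))
# prints: <h2>Course Syllabus</h2><h3 class='x'>Textbook</h3>
```

Because h6 is the deepest level, an original h5 and h6 both end up as h6, so a document that already uses all six levels would need manual review.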
65:45 - And when you paste it, it will save that H1. The base64 image, if you’re using the Ally version that includes that embedded image, that base64 image is stripped out.
65:59 - When you paste it into the Rich Text Editor, it actually is there, you can see it.
66:03 - And you can check the alt text, make sure that’s good, do all the things with image formatting that you can do within the image.
66:10 - But when you save it and publish it, that image is stripped out then.
66:15 - So I actually have raised a ticket, I sent a ticket to help at UW and have been talking with the Canvas people.
66:22 - Our service owner with Canvas is out of the office until after Independence Day, but this has been escalated to Instructure to see if it’s even technically possible for them on their end to preserve that base64 image.
66:42 - And so I’ll keep y’all posted as to whether it’s possible to keep that.
66:51 - But anyway, that [AUDIO OUT] stripped out, and so you would have to add images back if you use this method.
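Since the embedded images have to be added back by hand, it can help to pull them out of the converted HTML as ordinary files first. The sketch below assumes the images appear as `src="data:image/...;base64,..."` attributes with double quotes, as in the output described above; the pattern and file naming are illustrative.

```python
import base64
import re

# Matches src="data:image/<kind>;base64,<payload>" with double-quoted values.
DATA_URI = re.compile(
    r'src="data:image/(?P<kind>[A-Za-z0-9.+-]+);base64,(?P<data>[A-Za-z0-9+/=\s]+)"'
)


def extract_embedded_images(html_text: str) -> list:
    """Return (suggested_filename, raw_bytes) for each base64-embedded image."""
    images = []
    for index, match in enumerate(DATA_URI.finditer(html_text), start=1):
        raw = base64.b64decode(match.group("data"))
        images.append((f"image-{index}.{match.group('kind')}", raw))
    return images


sample_html = '<p>Logo:</p><img src="data:image/png;base64,YWJj" alt="Course logo">'
for name, raw in extract_embedded_images(sample_html):
    print(name, len(raw), "bytes")
# prints: image-1.png 3 bytes
```

Each pair can then be written to disk, for example with `pathlib.Path(name).write_bytes(raw)`, and uploaded through the editor's normal image tool, with the alt text copied over from the original.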
66:58 - The tables have no border, but those can easily be added, and so you paste the content in and then you do some touch-up.
67:05 - And you’re probably going to have to do a little bit of touch-up, regardless of which method you use.
67:12 - One really interesting caveat that took me a while to test this thoroughly and really track it down is that, if you copy from any browser other than Firefox– and there are a couple of bullets here of everything I tried– copy from Chrome or Safari in Mac OS, copy from Chrome or Edge in Windows 10, and then paste that into Firefox in either Mac or Windows in the Rich Text Editor in Canvas.
67:44 - Then, what you get is all the inline styles are added and preserved.
67:51 - And so I showed the source code here, that everything that was in that Ally doc where it did have some inline styles, everything gets preserved.
68:01 - And if you start with the Acrobat exported HTML file that has a lot of inline styles, again, it’s the same thing.
68:10 - That content gets preserved, but only in Firefox, and only if you copy from a browser other than Firefox.
68:18 - So if you copy from Firefox and paste into the Rich Text Editor in Firefox, this doesn’t happen.
68:27 - For whatever reason, Firefox seems to be taking that inline style content and preserving it from other browsers, but not from itself.
68:36 - And I honestly don’t have a good explanation for why that’s happening, or whether it’s a feature or a bug, but it is happening.
68:44 - I tested it extensively, and it reliably happens.
68:48 - So if anybody knows why, I would love to talk further about that.
68:52 - And hopefully, it’s not a bug that they’re going to fix and this goes away, because it might be useful.
68:57 - However, as Dan has pointed out and I’m convinced too, the appearance really should not be a driving force here.
69:07 - That ideally, when you place content into your Canvas course it will look like all your other Canvas pages, not like a separate document and it’s the content that really reigns supreme.
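If the converted HTML carries inline styles you do not want, they can be removed before pasting so the destination theme takes over. This is a minimal sketch built on Python's standard html.parser. It is meant for body fragments: comments and doctype declarations are dropped, and `<style>` or `<script>` blocks are not handled specially. The class and function names are hypothetical.

```python
from html import escape
from html.parser import HTMLParser

# Attributes to drop. Add "class" here to remove converter class names too.
PRESENTATIONAL_ATTRS = {"style"}
# Elements that never take a closing tag.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link", "col"}


class StyleStripper(HTMLParser):
    """Re-emits the HTML it is fed, minus presentational attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        rendered = "".join(
            f" {name}" if value is None else f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
            if name not in PRESENTATIONAL_ATTRS
        )
        self.parts.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Text arrives unescaped, so escape it again on the way out.
        self.parts.append(escape(data, quote=False))


def strip_inline_styles(html_text: str) -> str:
    stripper = StyleStripper()
    stripper.feed(html_text)
    stripper.close()
    return "".join(stripper.parts)


print(strip_inline_styles('<th scope="col" style="font-weight:bold">Week</th>'))
# prints: <th scope="col">Week</th>
```

Structural attributes such as scope and alt pass through untouched, which is the point: the semantics survive and only the appearance is left to the theme.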
69:20 - So adding to WordPress– if you open up in a browser, copy it, paste it into WordPress– and I did this with our accessible technology website.
69:32 - We are using the hosted WordPress service from UMAC, and this is the Boundless theme.
69:44 - It adopts the styles of the theme, and so the inline styles may actually be there, but they’re overwritten in most cases by the styles from the theme.
69:58 - And the base64 image is stripped out, I have not talked to UMAC to see whether that could be preserved there.
70:08 - But it looks like the theme, it’s branded properly for the website, and that really is ideal.
70:15 - It should look like other pages on the site.
70:17 - I don’t think it’s important that it preserve the styles.
70:22 - If you add to Drupal, copy and paste Ally-Converted HTML, paste that into the Drupal editor.
70:32 - And this is the default Drupal editor, so I just used a clean copy of Drupal 8– but nothing other than out of the box Drupal, and pasted, and this was the one place where the base64 encoded image is in fact preserved.
70:53 - And it has alt text because the alt text was there in the original.
70:59 - In all of these cases, the structure that’s there is preserved, inline styles are added a little bit and again, the rule of Firefox and other browsers– that relationship is true regardless of what you’re pasting into.
71:19 - And the catch here, though, is that the text format– when you paste into Drupal there is a text format field, and the options there are full HTML, basic HTML and restricted HTML.
71:34 - And with restricted HTML, you have particular tags that are allowed while others are not.
71:40 - Probably, the chances that you have full HTML enabled are pretty slim.
71:45 - I imagine most behind the scenes Drupal webmasters are tightening that up a little bit.
71:55 - But you do have to have full HTML in place to preserve the appearance and to get that base64 image.
72:04 - So my initial excitement may have worn off now that you know that.
72:09 - But if you do have access to a Drupal site where full HTML is allowed or maybe that is conditional based on privileges and certain users have full HTML privileges, then talk to your webmaster about that, because maybe that could be opened up enough so that people can paste HTML documents, including the images from those HTML documents.
72:33 - So given all that, the ultimate recommended workflow is to– again, Ally– so the first two bullets are the same as what I shared earlier.
72:45 - Ally is your best bet, Acrobat Pro exporting from that is the second best option.
72:53 - Then open the converted HTML page in the web browser, copy it, paste it into Rich Text Editor of your– whatever, your content management system, your learning system.
73:05 - And if preserving the original appearance is important, then you need to do that other browser– non-Firefox to Firefox trick.
73:15 - And then finally, no matter what you do, there’s probably going to be some touch-up required.
73:19 - And so, check it, make sure the headings are appropriate.
73:23 - If not, fix the headings. That’s probably the most important thing, is just make sure that you’ve got a good heading structure that forms an outline.
73:31 - Make sure that lists are coded as lists, and if not, select them and click the List button in the Rich Text Editor toolbar to make them into lists.
73:40 - Make sure your tables have headers, make sure your images have alt text, then probably– in most cases, unless you can get that base64 image to be preserved, then probably you’re going to need to re-upload images into the HTML page.
73:55 - So there may be a little bit of work as a final step to touch things up and make sure that it’s a good working functional document with all the information you want.
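The touch-up checklist above can be partly automated. The sketch below is a rough first pass, not a replacement for manual review or for a real accessibility checker: it reports heading levels and skipped levels, images with no alt attribute or with the placeholder alt text "image", tables without any TH cell, and the number of lists. All names are hypothetical.

```python
from html.parser import HTMLParser


class TouchUpAudit(HTMLParser):
    """Collects the structural facts the checklist asks about."""

    def __init__(self):
        super().__init__()
        self.heading_levels = []
        self.images_needing_alt = 0
        self.tables_without_headers = 0
        self.lists = 0
        self._open_tables = []  # one flag per open table: has a <th> been seen?

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.heading_levels.append(int(tag[1]))
        elif tag == "img":
            alt = attr_map.get("alt")
            # Missing alt, or the generic placeholder some exporters insert.
            if "alt" not in attr_map or (alt or "").strip().lower() == "image":
                self.images_needing_alt += 1
        elif tag == "table":
            self._open_tables.append(False)
        elif tag == "th" and self._open_tables:
            self._open_tables[-1] = True
        elif tag in ("ul", "ol"):
            self.lists += 1

    def handle_endtag(self, tag):
        if tag == "table" and self._open_tables:
            if not self._open_tables.pop():
                self.tables_without_headers += 1


def audit(html_text: str) -> dict:
    parser = TouchUpAudit()
    parser.feed(html_text)
    parser.close()
    levels = parser.heading_levels
    return {
        "heading_levels": levels,
        "skipped_heading_levels": [(a, b) for a, b in zip(levels, levels[1:]) if b > a + 1],
        "images_needing_alt": parser.images_needing_alt,
        "tables_without_headers": parser.tables_without_headers,
        "lists": parser.lists,
    }


sample_page = (
    '<h1>Syllabus</h1><h3>Textbook</h3>'
    '<img src="a.png" alt="image"><img src="b.png">'
    '<table><tr><td>Week</td></tr></table>'
    '<ul><li>Reading</li></ul>'
)
print(audit(sample_page))
```

An empty alt attribute is deliberately not flagged, because it is the correct markup for a purely decorative image. Whether a given alt text is actually meaningful still needs a human reader.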
74:07 - But that, arguably, is going to be a lot less work than all the work that goes into fixing an inaccessible PDF as you’ve seen from my co-presenters here.
74:20 - So it looks like we didn’t lose too many people, we still have a crowd of 13.
74:27 - And it looks like some content in chat. Have others been monitoring that to see, are those actual questions? Are there any questions? DAN COMDEN: Yeah, there were a couple of questions, I think.
74:40 - We eventually got to them. I, of course, had it backwards at one point.
74:45 - But your slides corrected me and I made sure to point out that the pasting has to be into Firefox, which is just surreal.
74:59 - It’s like, is this 1990 again? What is going on? TERRILL THOMPSON: Yeah, and I wish I knew more.
75:06 - And I actually still plan to continue this investigation and see what I can learn.
75:12 - But I’ve never heard of this before. I don’t know if other people are even aware of it, if anybody at Mozilla is aware that their tool has this magic functionality.
75:29 - DAN COMDEN: We’ve got a little bit of time. Well, we’re way over time, but I think the stuff that Gaby and Terrill had to share was just fantastic.
75:41 - And I appreciate the collaboration, once again.
75:46 - Any questions, you can ask them here if you want to hang out for a little bit more, or you can contact us directly.
75:57 - There is a question there, Terrill. TERRILL THOMPSON: So in terms of students enrolled in a course that has PDFs posted in Canvas, using the Ally tool could make those docs accessible? The potential is there.
76:15 - It certainly does its best, and arguably does a better job than other automated tools, better than SensusAccess.
76:23 - But it also is garbage in, garbage out and this is all based on one document, which is a pretty simple document.
76:32 - And visually, it’s really easy to tell what the structure is.
76:35 - So whatever Ally’s algorithms are, they are not tested extensively by this document.
76:41 - But imagine if we throw a bunch of much more complex ugly documents at it, it’s going to be sort of hit and miss.
76:50 - It will be able to create accessible HTML out of those some of the time, but it’s going to be problematic other times.
76:59 - And that’s not a substitute, relying on that is not a substitute for creating accessible documents from the get-go.
77:09 - DAN COMDEN: Really, we need to go back to our content creators and really press them on this whole, why is it a PDF? Because if they think they’re doing it because it’s more secure or less changeable, they have, let’s say, an incomplete understanding of what PDF is.
77:28 - It’s not secure, it can be changed. And really, in the interest of making things more readable, easier to use, more accessible.
77:39 - If they’re starting with a Word document, pasting into Canvas from that Word document is really a great way to go.
77:49 - It’s native to the platform, it’s going to flow well on all kinds of different devices, and it really is a superior way to present information.
78:01 - A comment about STEM fields. He’s using TeX and LaTeX.
78:06 - That’s kind of a whole other world, I will say that presenting math in PDF files is still a– I don’t want to say terrible.
78:18 - Can I say terrible? It’s not a good experience.
78:21 - It’s really not a good platform for sharing STEM content.
78:27 - TeX and LaTeX are much better for writing and reading it, assuming people have the viewers for that.
78:33 - Canvas does have some good integration for LaTeX.
78:37 - And then we get involved with producing Braille from LaTeX as well.
78:43 - So that actually renders quite well when we take that next step of producing bumpy paper, or Braille documents, actual physical Braille documents, but that’s really outside the scope of this talk.
78:55 - I will say that LaTeX is actually pretty good.
79:02 - It’s another markup language that actually predates HTML.
79:07 - TERRILL THOMPSON: Also, Canvas now, it renders.
79:10 - You can use LaTeX to author math equations, and it then renders those in MathML, which is becoming pretty well-supported by assistive technologies.
79:23 - So I haven’t looked at converting math content.
79:27 - That was actually something, I saw an early prototype of Ally before it got purchased by Blackboard, where they were doing math conversion.
79:36 - So I know there is some of that going on behind the scenes, but I haven’t really explored that fully to see how successful that is overall.
79:47 - But I think some combination of all these things could result in accessible math within courses without a whole lot of extra effort.
79:59 - I know if you just type formulas directly into the Rich Text Editor using the math tool within the Canvas Rich Text Editor then you get accessible math, so that’s one nice way to go.
80:17 - DAN COMDEN: There is a question about PowerPoint or Google Slides in a more accessible format that is also non-editable.
80:29 - I personally encourage people to set aside the idea of non-editable.
80:35 - You can make it difficult to edit, but the Canvas instructor essentially is the one who gets to say what the source is.
80:48 - But I haven’t done a lot of instruction. I don’t know what all the ins and outs of that are, and maybe that’s a whole other presentation.
80:59 - If you want to make things non-editable, typically what you end up with is also inaccessible.
81:05 - JOE: Can I jump in? DAN COMDEN: Yeah, go ahead, Joe.
81:08 - JOE HANNAH: I’m sorry, I just wanted to clarify, that this is not a course.
81:13 - It’s not like what you’re doing now, you’re going to send out these slides to everybody, but what format are you going to send them out in and is that format accessible? We’re doing something very similar with our webinars, we do the webinar, and then we send the slides out.
81:26 - And they’re not to students, since it’s not on Canvas.
81:29 - Do you have any suggestions in that circumstance? TERRILL THOMPSON: We’re not going to send these out.
81:34 - All three of our decks are PowerPoint, and we will run the PowerPoint accessibility checker, and that will give us some tips if we overlooked adding alt text to some of these images and some other things that we need to look at.
81:52 - We’re not concerned about the content being non-editable or other people being able to use it.
82:02 - And so that part of the question we can’t relate to necessarily, but just in terms of accessibility, PowerPoint is a good format natively.
82:11 - Google Slides– if you do ultimately decide you want to get a PDF with an interest in non-editable, again, you can convert from PDFs, as we’ve demonstrated, to other formats.
82:25 - And so it’s not really non-editable; people can still get to that content.
82:28 - But if you export from PowerPoint to PDF, then it creates a decently accessible tagged PDF, all the alt text will be preserved.
82:43 - Each slide is represented by a heading, so it’s really easy for screen reader users to navigate.
82:48 - If you export from Google Slides, Google does not generate a tagged PDF at all.
82:55 - So for docs or slides, definitely if you’re going to export, stay away from the G-Suite.
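Whether an exported PDF is tagged can be roughly checked by looking for the markers a tagged PDF normally carries. This is only a heuristic sketch under stated assumptions: it scans the raw bytes for a /StructTreeRoot entry and a /Marked true flag, so it can miss tagged files whose catalog is stored in a compressed object stream, and it says nothing about the quality of the tags. The function name is hypothetical.

```python
import re


def looks_tagged(pdf_bytes):
    """Heuristic: True if the raw PDF bytes show signs of a tag structure."""
    # A tagged PDF's catalog points at a structure tree...
    has_structure_tree = b"/StructTreeRoot" in pdf_bytes
    # ...and its MarkInfo dictionary sets Marked to true.
    is_marked = re.search(rb"/Marked\s+true", pdf_bytes) is not None
    return has_structure_tree and is_marked
```

For a file on disk, the bytes could be read with open(path, "rb").read() and passed in. A False result is worth confirming in a PDF reader's tag view, given the caveat about compressed object streams.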
83:04 - DAN COMDEN: Aimee’s got her hand up, or had her hand up.
83:08 - You still with us? We are well over.
83:10 - AIMEE KELLY: Yes, I’m here, I’ll try to keep it quick.
83:14 - But I work with Joe, and so our questions are related.
83:20 - Dan, on your hierarchy that you had of the most desirable to undesirable electronic document formats, I think that PowerPoint was kind of down at the bottom close to the PDF.
83:36 - And so that kind of raised bells in my head and made me wonder if there was something we should be trying to export our PowerPoint files to, yet a different kind of format? DAN COMDEN: No, that’s an excellent point, Aimee.
83:55 - Thanks for catching that. PowerPoint, like Terrill just said, it can be pretty accessible.
84:03 - It’s not a very accessible platform to work within for anybody who needs to create content, but the content can be made very accessible.
84:15 - Again, what is the value in that PowerPoint, and would it be better served as just an HTML file or set of HTML files? So the problem is that so much PowerPoint content is created that is terrible for accessibility because, unfortunately, you can do all kinds of inaccessible things very, very easily using that tool.
84:49 - TERRILL THOMPSON: Also, Aimee, you and I have talked about this recently.
84:53 - We have done some recent research– looks like Hadi left.
84:58 - Hadi left early, what’s up with that? He kept us around for two solid days.
85:07 - DAN COMDEN: That’s hardly fair. TERRILL THOMPSON: Anyway, Hadi and Gaby and I sat down and did some extensive PowerPoint testing just observing how he interacts with PowerPoint.
85:25 - And essentially, he would go into slide view mode and would view it as a slideshow using a screen reader, and the results with JAWS versus NVDA are completely opposite one another.
85:42 - And so we’re working on gathering notes from that session and documenting what we found.
85:49 - And probably filing some bugs with screen reader developers to try and get some consistency there.
85:56 - But for that reason, I’m hesitant to say PowerPoint should be high on Dan’s hierarchy list, and through no fault of Microsoft’s I think.
86:06 - It’s just technology support. AIMEE KELLY: I think the other thing that we’re realizing too is that, where these are info sessions and group advising sites, we’re trying to stay consistent with the UW branding.
86:22 - And that is also something that gets very difficult to maintain, because you have to download and install special fonts to see them in their proper format.
86:36 - And prospective students aren’t necessarily going to have those fonts downloaded, so then it all looks funky.
86:45 - TERRILL THOMPSON: They do publish those standard alternative fonts on their– AIMEE KELLY: That’s true.
86:50 - TERRILL THOMPSON: Boundless page, that’s what I use.
86:55 - DAN COMDEN: Joe has his hand up. JOE HANNAH: And I’m sorry.
86:58 - We’re still talking about PowerPoint, Aimee, but this actually goes to a bigger point.
87:04 - One of your arguments against PDF was that you have to have one more layer of proprietary software to read a PDF file. The same is true of PowerPoint and Word.
87:12 - And if you’re a UW student, yeah, you can get that for free, but we’re dealing with a lot of prospective students and people outside the university.
87:21 - So is there a way to avoid using these other proprietary formats like Word and PowerPoint? I’ve been trying to find a way to convert a PowerPoint to HTML.
87:37 - Again, this is a webinar, so we are doing a PowerPoint presentation.
87:41 - And I can’t find a decent way to convert a PowerPoint to HTML. They used to have it, but they’ve taken it out.
87:46 - Do you have any other suggestions or ways to do that? TERRILL THOMPSON: That actually would be another great topic for another session, I think, but there are quite a few free tools for doing HTML slides as an alternative to PowerPoint, though that’s not exporting from PowerPoint.
88:06 - Slidy, I think, was one, and slider.js. With a lot of these tools, the idea is to use standard HTML, then add a JavaScript file that renders that standard HTML as an interactive slideshow, and then style everything using CSS.
88:29 - Actually, for a period of my life, I used that exclusively for slides because nothing beats the standard HTML with a little bit of CSS and JavaScript, and it works in your browser.
88:43 - But the challenge there was distributing it, because people just expect PowerPoint.
88:52 - It’s universal, I guess, and they want a copy of my slides.
88:54 - And if I’m using one of those HTML tools, I have a bunch of files involved.
89:00 - So I could just send them the HTML file, which has all the content, but it’s not going to be rendered then as slides and it’s not going to have the visuals from the CSS.
89:10 - And so it just was less desirable from a content-sharing perspective, but if it’s just going to be online, that might be a viable solution.
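The HTML slide approach described above (standard HTML, a JavaScript file to render it as a slideshow, and CSS for the visuals) can be sketched in a few lines. This is an illustrative sketch, not the method used by Slidy or any other tool named in the session. It assumes the CSS and JavaScript are inlined so that the deck is a single file, and the function name, markup, and key bindings are made up for the example. Each slide gets its own heading, and if the script does not run, the page stays readable as an ordinary document.

```python
from html import escape

# Slides are hidden only after the script has opted the page into slideshow mode.
CSS = """
.slideshow section { display: none; }
.slideshow section.current { display: block; font-size: 150%; padding: 2em; }
"""

# Show one <section> at a time; left/right arrow keys move between slides.
JS = """
const slides = document.querySelectorAll('section');
let index = 0;
function show(i) {
  index = Math.max(0, Math.min(slides.length - 1, i));
  slides.forEach((s, n) => s.classList.toggle('current', n === index));
}
document.body.classList.add('slideshow');
document.addEventListener('keydown', (e) => {
  if (e.key === 'ArrowRight') show(index + 1);
  if (e.key === 'ArrowLeft') show(index - 1);
});
show(0);
"""


def build_deck(title, slides):
    """Return one HTML document; each (heading, bullets) pair becomes a section."""
    parts = []
    for heading, bullets in slides:
        items = "".join(f"<li>{escape(b)}</li>" for b in bullets)
        parts.append(
            f"<section><h2>{escape(heading)}</h2><ul>{items}</ul></section>"
        )
    return (
        "<!DOCTYPE html>\n"
        f'<html lang="en"><head><meta charset="utf-8"><title>{escape(title)}</title>'
        f"<style>{CSS}</style></head><body><h1>{escape(title)}</h1>"
        f"{''.join(parts)}<script>{JS}</script></body></html>"
    )
```

Writing the returned string to a file such as slides.html (a hypothetical name) gives a deck that opens in a browser. Inlining everything is one way to ease the sharing problem mentioned above, at the cost of repeating the CSS and JavaScript in every deck.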
89:23 - I did, for a while, look at exporting from PowerPoint, too.
89:30 - And I haven’t looked at it recently, but what they used to do when you exported to HTML wasn’t good.
89:38 - It was nowhere near the quality of HTML that you get if you’re just creating it in HTML from scratch and using a tool like slider.js to render it.
89:47 - DAN COMDEN: PowerPoint to PDF to HTML, where does the madness end? TERRILL THOMPSON: Can’t we all just have one format? The world would probably be less interesting then, maybe.
90:06 - DAN COMDEN: All that said, I love ebooks. I’m on ebooks all the time, that’s my preferred method for reading books anymore.
90:13 - I still love actual print books, but when I’m reading nowadays it’s all– EPUB, Kindle format, all that stuff works really well.
90:22 - Still, getting a book in PDF is a real soul crusher though, because that is no fun.
90:29 - We are well over our scheduled time. Thank you everybody for sticking with us, it’s always a fun conversation.
90:42 - TERRILL THOMPSON: Stay cool in the forthcoming heat.
90:45 - DAN COMDEN: Hot, hot, hot coming. Stay cool.
90:49 - TERRILL THOMPSON: Thanks, everybody.