By Scriptorium - The Content Strategy Experts
Are you looking for real-world examples of enterprise content operations in action? Join Sarah O’Keefe and special guest Adam Newton, Senior Director of Globalization, Product Documentation, & Business Process Automation at NetApp for episode 175 of The Content Strategy Experts podcast. Hear insights from NetApp’s journey to enterprise-level publishing, lessons learned from leading-edge GenAI tool development, and more.
We have writers in our authoring environment who are not writers by nature or bias. They're subject matter experts, and they're in our system generating content. That was the pitch: join us in our environment, reap the benefits of multi-language output, reap the benefits of fast updates, reap the benefits of being able to deliver a web-like experience as opposed to a PDF. But what I think we've found now is that this is a data project. This generative AI assistant has changed my thinking about what my team does. Yes, on one level, we have a team of writers devoted to producing the docs. But in another way, you can look at it and say, well, we're a data engine.
— Adam Newton
Transcript:
Sarah O’Keefe: Welcome to The Content Strategy Experts podcast, brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we talk about content operations with Adam Newton. Adam is the senior director of global content experience services at NetApp. Hi everyone, I’m Sarah O’Keefe. Adam, welcome.
Adam Newton: Hey there, how are you doing, Sarah?
SO: It’s good to see and/or hear you.
AN: Good to hear your voice.
SO: Yeah, Adam and I go way back, which you may discover as we go through this podcast. And as those of you that listen to the podcast know, we talk a lot about content ops. So what I wanted to do was bring somebody in who is doing content ops in the real world, as opposed to as a consultant, and ask you, Adam, about your perspective as the director of a pretty good-sized group that’s doing content and content operations and content strategy and all the rest of it. So tell us a little bit about NetApp and your role there.
AN: Sure. So NetApp is a Fortune 500 company. We have probably close to 11,000 or more global employees. Our business is primarily data infrastructure and storage management, both on-prem and in the cloud. We sell a storage operating system called ONTAP. We sell hardware storage devices, and most importantly, I think, at this day and age, we are integrating with Azure, Google Cloud Platform, and AWS through first-party hyperscaler partnerships. My team at NetApp is, well, I actually have three teams under me. The largest of those three teams is the technical publications team. The second is globalization, responsible for localization and translation of both collateral and product. And then finally, and newest to my team, is our digital content science team, which is our data science wing. I have about 50 to 53, I think, employees at this point in my organization, and all told probably about a hundred with our vendor partners.
SO: And so I think we all have a decent idea of what the technical publications team and the globalization teams do. Can you talk a little bit about the data science side? What is that team up to?
AN: Yeah, thank you for asking that question. So about two years ago, I was faced with an opportunity to hire. And maybe some of your listeners who are managers are familiar with that situation, right? I hope they are, rather than not being able to hire. I took a moment and thought a little bit more about what I needed in the future. And I thought a little bit differently about roles and responsibilities and opportunities inside NetApp and the broader content world, and decided to bring in a data scientist. And then I thought a little bit more about, well, there are other data scientists at NetApp. Why would I need one? And I thought a little bit about the typical profile of the data scientists at that time at NetApp, mostly in IT and other product teams. Those data scientists were primarily quantitative data scientists coming from computer science backgrounds. And I thought, well, you know, we’re in the content business. I want to find a data scientist who is a content specialist, who has a background in the humanities, and who also has core data science skills, emphasizing, for example, NLP. And so that was my quest. And I was very, very fortunate to find a PhD candidate in English who wanted to get out of the academy and who had these skills. And it’s been an incredible boon to our organization. We’ve even hired a second PhD in English recently. And Sarah, since you and I are friends, I’ll say one was from UNC and one was from Duke. Okay. So we don’t have to have that discussion here. I’m an equal opportunity person. Although I did hire the UNC one first, Sarah.
SO: I see, I see. So for those of you that don’t live in North Carolina, this is… I’m not sure there is a comparison, but it is important to have both on your team. And I appreciate your inclusion of everybody. It is kind of like… I’ve got nothing.
AN: Yes.
SO: Okay, so you hired some data scientists from a couple of good universities. And do they get along? Do they talk to each other?
AN: Fabulously, yes. No petty grievances.
SO: Okay, just checking. All right. So how do you, in this context then, what does your environment look like? What kinds of things are you doing with the docs team? And what’s the news from NetApp docs?
AN: So maybe a little bit of background actually, and you and I have talked about this previously, but we used to be a DITA shop. And then as things sped up inside our business with the adoption and development of cloud services at NetApp, we found that some of the apparatus of our DITA infrastructure, our past practices, wasn’t able to keep up with the speed of the cloud services that were being developed. I think this is actually, and I’ve talked to other people in our business, a very common situation. We handled it in one way. There are many ways to handle it, but the way we chose to handle it was to exit DITA and to move, in our source format anyway, to a format called AsciiDoc, which I frequently describe as a dialect of Markdown. And we went from being a closed system of technical writers working inside a closed CMS to adopting open source. We now work in GitHub. Our pipeline is all open source, and we now have contributors to our content who are not technical writers. In some cases, they’re technical marketing engineers, solution architects, and so forth, as well as a pipeline of docs that we build through automations where we, for example, transform API specifications or reference docs that are maintained by developers and output those into our own website, docs.netapp.com. In addition to just the docs part, my globalization team has been using machine translation for many years. So speaking to one particular opportunity of being in one organization: when we output our docs, and whenever we update our docs in English, they’re automagically updated in eight other languages and published to docs.netapp.com. So we roughly maintain 150,000 English files, and you can times those by eight. Is that right? Did I do the math right? Yeah.
SO: Or nine, depending.
AN: Nine. Yeah. Is English the language? Yeah, sure. Let’s count it.
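[Editor's note: a minimal sketch of the kind of automated transform Adam describes, where API reference docs maintained by developers are rendered into pages for the docs site. The spec shape, function name, and AsciiDoc conventions here are illustrative assumptions, not NetApp's actual pipeline.]

```python
# Hypothetical docs-as-code automation: render a machine-maintained,
# OpenAPI-like spec into a single AsciiDoc reference page.
# All names and conventions are invented for illustration.

def spec_to_asciidoc(spec: dict) -> str:
    """Render a minimal API spec as one AsciiDoc page."""
    lines = [f"= {spec['title']}", ""]
    for path, ops in sorted(spec["paths"].items()):
        for method, op in sorted(ops.items()):
            # One level-2 section per operation, e.g. "== GET /volumes"
            lines.append(f"== {method.upper()} {path}")
            lines.append("")
            lines.append(op.get("summary", "(no summary)"))
            lines.append("")
    return "\n".join(lines)

spec = {
    "title": "Volumes API",
    "paths": {
        "/volumes": {
            "get": {"summary": "List all volumes."},
            "post": {"summary": "Create a volume."},
        },
    },
}
print(spec_to_asciidoc(spec))
```

In a real pipeline, a job like this would run on every merge to the docs repository, so the generated pages stay in step with the developer-owned source of truth.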
SO: Depends on how we use it. Okay, so you have an ASCII doc, you know, Markdown-ish. Is it fair to call it Docs as Code environment?
AN: So we often describe it as a content ops environment. I’m not sure if that is different from docs as code, but I think maybe I will accept that as a reasonable description in the sense that we have asked our team members to think about the content that they’re writing as highly structured, semantically meaningful units of information, in the same way, I think, a developer can be asked to think of their code being that way. And the system in which we write is VS Code; many engineers are writing in that.
SO: Mm-hmm.
AN: And of course our source files, as I mentioned, all in our automation and our pipelines are all based on being in GitHub.
SO: And so then you’ve got docs.netapp.com as a portal or a platform where a lot of this content goes. And what’s happening over there? Do you have any news on new things you’ve done there?
AN: Yeah. I mean, very recently. You know, the timing of this is really interesting. We have been working on a generative AI solution for a year, Sarah. You’ll recall the hype, right? When ChatGPT exploded into the public consciousness, right? Through the media. And shortly thereafter, we began imagining what it might look like to leverage that technology, those types of technologies, to deliver a different customer experience. And we identified a chatbot as being something we thought could add to the browse and search experiences on docs.netapp.com. And we just released that on the 20th of August and announced it internally inside NetApp on the 27th. So we are literally like 48, 72 hours into a public adventure here.
SO: I take full credit for planning it, even though I knew nothing about any of this.
AN: Yeah. And that was a long time, I think it’s worth noting, too. It was a long time. And I think it’s beyond the full dimensions of this discussion to talk about why it took so long. But I will say, you know, we were early adopters, and we felt the pain and the benefit of being that. You know, it was like changing the tires on a race car, right, that was speeding around the track. So we had to learn and be responsive and also humble, in the sense that there were some missteps that we had to recover from and some magical thinking, I think, at the beginning of the project that was qualified more over the course of the project.
SO: And so what does that GenAI solution sitting in or over the top of the docs content set, what does that do in terms of your authoring process? Do you have any, are there any changes on the backend as you’re creating this content that is then consumed by the AI?
AN: I would say we’re in the process of understanding the full implications of having this new output surface, this generative AI assistant, and fully grappling with what the implications are for the writers. We find ourselves frequently in discussions about audience. And audience is all those humans that we have been writing for and a whole bunch of machines that we now need to think more consciously about, you know, and it’s, we find ourselves often talking about standards and style, but not just from the perspective of, you know, writing the docs in a consistently patterned way for humans to be able to consume well, but also because patterns and machines are a marriage made in heaven. And we see actually opportunities to begin to think of the content we’re writing as a data set that needs to be more highly patterned and predictable so that a machine can consume it and algorithmically and probabilistically decide how to generate content from the content we’re creating.
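[Editor's note: a toy sketch of what Adam's "content as a highly patterned data set" framing can mean in practice: splitting a consistently structured AsciiDoc file into addressable records a machine can consume, for example to feed a retrieval index. The heading convention and record fields are assumptions for illustration.]

```python
# Illustrative only: treat a docs file as data by emitting one record
# per level-2 AsciiDoc section. Predictable heading patterns are what
# make this kind of machine consumption reliable.

def to_records(asciidoc: str, source: str) -> list[dict]:
    """Split one AsciiDoc file into one record per '== ' section."""
    records, heading, body = [], None, []
    for line in asciidoc.splitlines():
        if line.startswith("== "):
            if heading is not None:
                records.append({"source": source, "section": heading,
                                "text": " ".join(body).strip()})
            heading, body = line[3:].strip(), []
        elif heading is not None:
            body.append(line)
    if heading is not None:
        records.append({"source": source, "section": heading,
                        "text": " ".join(body).strip()})
    return records

doc = (
    "= Snapshots\n\n"
    "== Create a snapshot\n\n"
    "Use the volume snapshot create command.\n\n"
    "== Delete a snapshot\n\n"
    "Use the volume snapshot delete command.\n"
)
records = to_records(doc, "snapshots.adoc")
print(records)
```

The payoff of the "patterns and machines are a marriage made in heaven" point is visible here: if writers deviate from the heading pattern, the records silently degrade, which is exactly why standards and style start to matter for machine audiences too.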
SO: And where is this going in terms of what’s next as you’re looking at this? I think you mentioned that there’s other opportunities potentially to add more data slash content.
AN: Yeah, actually, if I back up to a detail I shared earlier, you know, we do have writers in our authoring environment who are not writers by nature or by bias; they’re subject matter experts, right? And they’re in our system and they’re generating content. That was about: join us in our environment, right? Join us in our environment, reap the benefits of multi-language output, reap the benefits of fast updates, reap the benefits of being able to deliver a web-like experience as opposed to a PDF. But what I think we’ve found now is that this is a data project. This generative AI assistant has changed my thinking about what my team does. And I think, yes, on one level, true. Yes, we have a team of writers and there’s a big factory devoted to producing the docs. But in another way, you can look at it and say, well, we’re a data engine. We own and maintain a large data set, and the GenAI is one consumer of that data set. But we’re also thinking about our data set as being joinable to other data sets inside of NetApp. And in particular, I work inside the chief design office at NetApp, along with UX researchers and designers. And we’re also more broadly part of our shared platform team at NetApp. So we’re thinking about how might we join our data with other teams’ data to create in-product experiences that are data-led or data-driven in combination with curated experience. So if your viewers were to be able to see me, I am waving my hand a little bit, not because I’m dissembling, but more because I’m aspiring. And I think there’s a really, really cool future ahead in a way, Sarah, that I think is super energizing for the writers, right? To see that their work is being reframed, not replaced or changed, right? The fear of writers with GenAI, right, is of being replaced.
Well, I would offer this as an example of, you know, maybe it’s not such a dismal view and maybe in fact there’s a very interesting future if you reframe your thinking about what you do and the opportunities to join what you do to create different experiences.
SO: And I think it’s an interesting perspective to look at GenAI as being a consumer of the content slash data that you’re putting out. A lot of the initial stuff was, this is great. GenAI will just replace all the tech writers. You’re talking about something entirely different.
AN: I guess I wanted to expand on that, because I think we’re actually now hovering on a really important point. You know, what is your mindset? How are you thinking about this moment in time? The broad we, right, or the broader us generally, who are in this industry. And, you know, I think we don’t see a great indication that GenAI can create net new content and do it well, honestly. I think it can do summarizing, it can make your day-to-day, your meeting notes and so forth, Microsoft Copilot, right? There are some great uses, but I have not seen convincing, compelling indicators that docs can be written by it, at least at the enterprise level, right? Our products are complex. We often talk about our writers as sense makers, right? And I think that we can take advantage of GenAI in the right ways. And I think this is one of the ways that we’re taking advantage of it, which is to give customers another experience. And frankly, also for us to learn a lot about what people are asking and assuming, and we can learn a lot and continuously improve.
SO: So what’s happening on the delivery side? Somebody asks for some sort of information and it gives either, it says it doesn’t exist or it gives an incorrect response. Are you seeing any patterns there? What are you doing with that?
AN: Yeah, many of your listeners might have produced products themselves, right, or delivered products themselves, and remember what happens in the first day or two of releasing a product, right? So the timing of this chat is really good. Yeah, in the last couple days, I was just talking to a data scientist on my team, and I was saying, you know, what I think I see here emerging as a possible pattern is that people don’t actually know how to use these things effectively. That, you know, they ask of it questions that it really could never answer, or they don’t fully understand the constraints of the system, meaning that, well, it’s only based on a certain data set. You know, they don’t know that the data set doesn’t include the data they’re looking for, right? Because it sits somewhere else. You know, we’re modifying our processes to intake feedback. I think there’s a really interesting nexus: is it the AI or is it the content? That’s the really interesting one, right? You know, was the content ambiguous, deficient, duplicitous, whatever, you know, is that a word?
SO: It is now.
AN: At UNC we use that word, not at Duke. But it is an interesting discussion inside our organization when we receive a piece of feedback, what’s causing it? Is it the interpretive engine or is it our source? And so we’re seeing a lot of gaps in our content, it’s exposing a lot of gaps or other suboptimal implementations.
SO: I mean, we’ve said that in a sort of glib manner, because of course you’re living this day to day and hour by hour, but we’ve said that, you know, GenAI sitting over the top of a content set is going to uncover all your inconsistencies, all your missing pieces, all your, you know, over here you said update and over here you said upgrade. That was an example I heard from someone else. And so it basically uncovers your technical debt.
AN: Yeah, beautiful. Yeah, bingo. You’re so right there. Terminology, right? My God. Can you believe how many ways we’ve talked about X, right?
SO: Right, and the GenAI thinks they’re different because, or it doesn’t think anything right, but the pattern isn’t there and so it doesn’t associate those things necessarily.
AN: Yeah, your listeners may commiserate with this, or the use of words as verbs and nouns, like cable. We often in our documentation talk about cabling devices. How would a GenAI know that the writer of the question is using cable as a verb or noun?
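[Editor's note: the update/upgrade example lends itself to a simple automated check. The variant pairs and function below are illustrative assumptions; a real terminology audit would use a managed termbase, and the verb-versus-noun problem Adam raises (cabling a device) needs part-of-speech analysis beyond this sketch.]

```python
# Toy terminology scan: flag text that mixes variant terms, so a human
# can decide which term the docs should standardize on.
import re
from collections import Counter

# Hypothetical variant pairs; a real list would come from a termbase.
VARIANT_SETS = [{"update", "upgrade"}, {"remove", "delete"}]

def mixed_terms(text: str) -> list[set]:
    """Return the variant sets for which this text uses more than one term."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    return [vs for vs in VARIANT_SETS
            if sum(1 for term in vs if words[term]) > 1]

sample = "Update the firmware before you upgrade ONTAP."
print(mixed_terms(sample))
```

Run across 150,000 files, even a blunt check like this surfaces the inconsistencies that a retrieval system would otherwise treat as unrelated concepts.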
SO: Mm-hmm. So as you’re working through this, with your, you know, it sounds like two days of go-live plus a year of suffering. A year and two days.
AN: Well, a year and two days, a year and two days.
SO: You know, I think you’re further along than a lot of other organizations. Do you have any advice for those that are just beginning this journey and just looking at these kinds of issues? What are the things you did best, or maybe worst, or would do the same way or not? What’s out there that you can tell people that’ll maybe help them as they move forward?
AN: Yeah, maybe think of it in the old people, process, systems dimensions. Actually, taking that latter one, systems, I would say beware the fascination of the system without thinking more about the processes and people that are going to be involved in the creation of some kind of generative AI solution. I think, you know, this is as much an adaptive people-and-process problem as it is a technical problem. Probably more, frankly, on the adaptive side. And from a process perspective, I’d say, be curious about what you learn. Be attentive to the specifics, but look for the broad patterns in the feedback or what you’re seeing as you develop these solutions. You know, I think I hinted at this before, and for me it has been, frankly, the epiphany of the project. There have been many, but I would really highlight this one, which is: what does my team do? What is the value of what they generate? And for me, yes, we are primarily a team that creates documentation, but, you know, holy smokes, the idea that we are data owners, and we govern a massive, semantically rich, non-deterministic, fast-changing data set, that is super, super interesting. Even here inside NetApp, Sarah, we have teams reaching out to us who frankly before probably never thought about the docs. And all of a sudden, because we have this huge data set, they’re like, wow, we can, you know, stress test our system or our new technologies using what they have. That’s a super cool moment for our team.
SO: Yeah, I think you’re the first person that I’ve heard describe this sort of context shift from this is content to this is data or this content is also data or however you want to phrase that. But I think that’s a really interesting point and opens up a lot of fascinating possibilities, not least for the English PhDs of the world. That’s super helpful.
AN: Is this where I confess that at one time I thought I was going to be one of those, and I got out because I realized I was terrible at it?
SO: No, no, no, that goes in the non-recorded part of the podcast. Yeah, I’m going to wrap it up there before Adam spills all of the dirt.
AN: Yeah, what am I compensating for, right?
SO: But thank you, because this is really, really interesting. And I think it will be helpful to the people listening to this podcast, because it’s so rare to get that inside view of what it really looks like and what’s really going on inside some of these bigger organizations as you move towards AI, GenAI strategies and figure out how best to leverage that. So thank you, Adam. And it’s great to see you.
AN: No, Sarah, thank you. And actually, I would like to thank my team. I mean, it has been an incredible adventure, and I think the team is really amazing.
SO: Yeah, and I know a few of them and they are great. So with that, thank you for listening to the Content Strategy Experts Podcast brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
The post Enterprise content operations in action at NetApp (podcast) appeared first on Scriptorium.
In episode 174 of The Content Strategy Experts podcast, Sarah O’Keefe and Alan Pringle explore the mindset shifts that are needed to elevate your organization’s content operations to the enterprise level.
If you’re in a desktop tool and everything’s working and you’re happy and you’re delivering what you’re supposed to deliver and basically it ain’t broken, then don’t fix it. You are done. What we’re talking about here is, okay, for those of you that are not in a good place, you need to level up. You need to move into structured content. You need to have a content ops organization that’s going to support that. What’s your next step to deliver at the enterprise level?
— Sarah O’Keefe
Transcript:
Alan Pringle: Welcome to the Content Strategy Experts Podcast, brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we talk about setting up your content operations for success. Hey everyone, I am Alan Pringle and I am back here with Sarah O’Keefe in yet another podcast episode today. Hello, Sarah.
Sarah O’Keefe: Hey there.
AP: Sarah and I have been chatting about this issue. It’s kind of been this nebulous thing floating around, and we’re gonna try to nail it down a little bit more in this conversation today: this idea of setting up your organization and its content operations for success. And to start the conversation, let’s just put it out there. Let’s define content ops. What are content operations, Sarah?
SO: Content strategy is the plan: what are we going to do, and how do we want to approach it? Content ops is the system that puts all of that in place. And the reason that content ops these days is a big topic of conversation is because content ops in sort of a desktop world is, well, we’re going to buy this tool, and then we’re going to build some templates, and then we’re going to use them consistently. And the end, right? That’s pretty straightforward. But content operations in a modern content production environment means that we’re talking about a lot of different kinds of automation and integration. So the tools are getting bigger, they’re scarier, they’re more enterprise level as opposed to a little desktop thing. And configuring a component content management system, connecting it to your web CMS, and feeding the content that you’re generating in your CCMS, your component content management system, into other systems via some sort of an API is a whole different kettle of fish than dealing with, you know, your basic old-school unstructured authoring tool. So yeah.
AP: Right. But in their defense, for the people who are using desktop publishing, that is still content operations.
SO: Sure, it is.
AP: It’s just a different flavor of content operations. And frankly, a lot of people, a lot of companies and organizations outgrow it, which is why they’re going to this next level that you’re talking about.
SO: Right. So if you’re in a desktop tool and everything’s working and you’re happy and you’re delivering what you’re supposed to deliver and basically it ain’t broken, then don’t fix it. You are done. You should shut off this podcast and go do something more fun with your time. Right? What we’re talking about here is, okay, for those of you that are not in a good place, you need to level up. You need to move into structured content. You need to have a content ops organization that’s going to support that. What do you do? What’s your, you know, what’s your next step and what does it look like to organize this project in such a way that you move into, you know, that next level up and you can deliver all the things that you’re required to deliver in the bigger enterprise, whatever you want to call that level of things. So desktop people, I’m slightly jealous of you because it’s all working and you’re in great shape and good for you. I’m happy for you.
AP: So making this shift from content operations and desktop publishing to something more enterprise level like you’re talking about, that is a huge mind shift. It is also technically something that can be quite the shock to the system. How do you go about making that leap?
SO: Well, I’m reminded of a safety announcement I heard on a plane one time where they were talking about how, you know, when you open the overhead bins after landing, you want to be careful. And the flight attendant said, shift happens. And we all just looked at her like, did you actually just say that? And she sort of smirked. So making this shift can be difficult, right? And what we’re usually looking at is, okay, you’ve been using, you know, Word for the past 10, 15, 20, 57 years. And now we need to move out of that into, you know, something structured, XML, maybe it’s DITA, and then get that all up and running. And so what’s going to happen is that you have to think pretty carefully about what does it look like to build the system and what does it look like to sustain it? Now here I’m talking particularly to large companies, because what we find is the outcome in the end, right, when this is all said and done and everything’s up and running and working, what you’re probably going to have is some sort of an organization that’s responsible for sustainment of your content ops. So you’re going to have a content ops group of some sort, and they’re going to do things like run the CCMS and build new publishing pipelines and keep the integrations moving and help train the authors. And in some cases, they’re kind of a services organization in the sense that you have an extended group of maybe hundreds of authors who are never going to move into structured content. So you’re taking on the, again, Word content that they are producing, but you’re moving it into the structured content system as a service, like an ingestion or migration service to your larger staff or employee population. Okay, so in the future world, you have this group that knows all the things and knows how to keep everything running and knows how to kind of manage that and maintain it and do that work.
And probably in there, you have an information architect who’s thinking about how to organize content, how to classify and label things, how to make sure the semantics, you know, the actual element tags are good and all that stuff. But right now, you’re sitting in desktop authoring land with a bunch of people that are really good at using whatever your desktop authoring tool may be. And you have to sort of cross that chasm over to, now we’re this content ops organization with structured content, probably a component content management system. So what I would probably look at here is, you know, what is the outcome? You know, thinking about the system has stood up, we’ve made our tool selection, everything’s working, everything’s configured, everything’s great. What does it look like to have an organization that’s responsible for sustaining that? And that could be, you know, two or three or 10 people, depending on the size, again, the size and scope of your organization and the content that you’re supporting. But in order to get there, you first have to get it all set up. You have to do the work to get it all up and running. Our job typically is that we get brought in to make that transition. Right? So we’re not going to be for a large organization, we’re not going to be your permanent content ops organization. We might provide some support on the side, but you’re going to have people in-house that are going to do that. They’re going to be presumably full-time permanent kind of staff members. They know your content and your domain and they have expertise in, you know, whatever your industry may be.
AP: Right.
SO: Our job is to get you there as fast as possible. So we get brought in to do that setting-up piece, right? What are the best systems? What are the things you need to be evaluating? What are the weird requirements that you have that other organizations don’t have that are going to affect your decisions around systems and, for that matter, people, right? Are you regulated? What is the risk level of this content? How many languages are you translating into? What kind of deliverables do you have? What kind of integration requirements do you have? And when I say integration, to be more specific, maybe you’re an industrial company, and so you have tasks, service, maintenance kinds of things, and you need those tasks, like how to replace a battery or how to swap out brakes, to be in your service management system so that a field service tech can look at their assignments for the day, which are, you know, go here and do this repair and go here and do this maintenance. And then it gets connected to, and here’s the task you need, and here’s the list of tools you need, and here are all the pieces and parts you need in order to do that job correctly. Diagnostic troubleshooting systems. You might have a chatbot, and you want to feed all your content into the chatbot so that it can interact with customers. You may have a tech support organization that needs all this content, and they want it in their system and not in whatever system you’re delivering. So we get into all these questions around where does this content go? You know, where does it have tentacles into your organization, and what other things do we need to connect it to, and how are we going to do that? So I think it’s very helpful to look at the upfront effort of making decisions, designing your system, and setting up your system versus sustaining, enabling, and supporting the system.
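[Editor's note: the field-service integration Sarah describes amounts to serving task content as structured data rather than pages. A minimal sketch of such a content-as-a-service payload follows; every name, task ID, and field in it is invented for illustration.]

```python
# Hypothetical content-as-a-service lookup: join a field tech's daily
# assignment with the structured task content (steps, tools) it needs.
# In practice this would sit behind an API; here it is an in-memory dict.

TASK_CONTENT = {
    "replace-battery": {
        "title": "Replace the battery",
        "tools": ["#2 Phillips screwdriver", "ESD strap"],
        "steps": ["Power down the unit.",
                  "Swap the battery.",
                  "Power the unit back up."],
    },
}

def assignment_payload(assignment: dict) -> dict:
    """Bundle one assignment with the task content it references."""
    task = TASK_CONTENT[assignment["task_id"]]
    return {
        "site": assignment["site"],
        "task": task["title"],
        "tools": task["tools"],
        "steps": task["steps"],
    }

payload = assignment_payload({"site": "RTP-02",
                              "task_id": "replace-battery"})
print(payload["task"], payload["tools"])
```

The design point is that the service management system never sees a PDF; it requests a task by ID and receives exactly the fields it needs, which is only possible when the content is authored as structured components.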
AP: There are lots of layers that you just talked about and lots of steps. It is very unusual, at least in my experience, to find someone, some kind of personnel resource, either within or hiring, who is going to have all of the things that you just mentioned because it is a lot to expect one person to have all of that knowledge, especially if you are moving to a new system, and you’ve got a situation where the current people are well versed in what is happening right now in that infrastructure, that ecosystem. To expect them to magically shift their brain and figure out new things, that’s a lot to ask for. And I think that’s where having this third-party consultant person, voice, is very helpful because we can help you narrow in on the things that are better fits for what you’ve got going on now and what you anticipate coming in the future.
SO: Yeah, I mean, the thing is that what you want from your internal organization is the sustainability. But in order to get there, you have to actually build the system, right? And nearly always when people reach out to us and say, we’re making this transition, we’re interested, we’re thinking about it, et cetera, they’re doing it because they have a serious problem of some sort. We are going into Europe and we have no localization capabilities, or we have them, but we’ve been doing, you know, a little bit of French for Canada and a tiny bit of Spanish for Mexico. And now we’re being told about all these languages that we have to support for the European Union. And we can’t possibly scale our, you know, 2.5 languages up to 28. It just can’t be done. We’ll drown. Or people say, we have all these new requirements and we can’t get there. We’ve been told to take our content that’s locked into, you know, page-based PDF, whatever, and we’re being required to deliver it not just onto the website and not just into HTML, but, you know, as content as a service, as an API deliverable, as micro content, all this stuff. And they just can’t get there from here. So you have people on the inside who understand, as you said, the current system really well, and understand the needs of the organization in the sense of these things that they’re being asked to do, and they understand the domain. They understand their particular product set internally. But it’s just completely unreasonable to ask them to stand up, support, and sustain a new system with new technology while still delivering the existing content because, you know, that doesn’t go away. You can’t just push the pause button for five months.
AP: No, the real world does not stop when you are going on some kind of huge digital transformation project like one of these content ops projects. So basically what we’re talking about here, especially on the front end, the planning and discovery side, is that we can help augment, help you focus. And then once you’ve kind of picked your tools and you start setting things up, there are some choices there, sometimes having to do with the size of an organization, about how to proceed with implementation and then maintenance beyond that. Let’s focus on that a little bit.
SO: Most of the organizations we deal with are quite large. Actually, all of the organizations we deal with are quite large compared to us, right? It’s just a matter of are they a lot bigger or are they a lot, a lot, a lot, lot bigger?
AP: Correct.
SO: Within that, the question becomes how much help do you want from us and how much help do your people need in order to level up and get to the point where they can be self-sufficient? We have a lot of projects where we come in and help with that sort of big hump of work, that big implementation push, and help get it done. And then once you go into sustainment or maintenance mode, it’s 10% of the effort or something like that. And so either you staff that internally as you’re building out your organization, or we stick around in a sort of fractional, smaller role to help with that. The pendulum has kind of shifted on this over time. Way back when, it was get in, do the work, and get out. We rarely had ongoing maintenance support. Then for a bit, we were doing a lot of maintenance relative to the prior efforts. And now it feels as though we’re seeing a bit of a shift back to doing this internally. Organizations that are big enough to staff a content ops group or a content ops person are bringing it back in-house instead of offloading it onto somebody like us. We’re happy to do whatever makes the most sense for the organization. At a certain size, my advice is always to bring this in-house because ultimately, your long-term staff member who has domain expertise on your products and your world and your corporate culture, and has social capital within your organization, will be more effective than an external organization, no matter how great we are.
AP: To wrap up, I think I want to touch on one last thing here, and that’s change management. And yes, we beat that drum all the time in these conversations on this podcast, but I don’t think we can overstate how important it is to keep those communication channels open and be sure everyone understands what’s going on and why you’re doing what you’re doing. What we’ve talked about so far is very much, okay, we’ve come up with a technical plan, we’ve done a technical implementation, and now we’re going to set it up for success and maintain it for the long haul and adjust it as we need to as things change. But there is still a group of people who have to use those tools: your content creators, your reviewers, your subject matter experts, I mean, I can go on and on here. They are still part of this equation, and we can’t forget about them while we’re so focused on the technical aspects of things.
SO: I would say this directly to the people that are doing the work, you know, the authors, the subject matter experts, the people operating within the system. I would look at this as an opportunity. It is an opportunity for you to pick up a whole bunch of new skills, new tools, new technologies, new ways of working. And I know it’s going to be uncomfortable and difficult and occasionally very annoying as you discover that the new tools do some things really well, but the things that were easy in the old tools are now difficult, right? There’s just going to be that thing where the expertise you had in old tool A is no longer relevant and you have to sort of learn everything all over again, which is super, super annoying. But it’s fodder for your resume, right? I mean, if it comes to it, you’re going to have better skills and you’re going to have another set of tools and you’re going to be able to say, yes, I do know how to do that. So I think that just from a self-preservation point of view, it makes a whole lot of sense to get involved in some of these projects and move them forward because it’s going to help you in the long run, whether you stay at that organization or whether you move on to somewhere else at some point in the future. That’s one of the ways I would look at this. It is certainly true that the change falls on the authors, right?
AP: Correct.
SO: They all have to change how they work and learn new ways of working, and there’s a lot there, and I don’t want to, you know, sort of sweep that aside because it can be very painful. We try to advocate for making sure that authors have time to learn the new thing, that people acknowledge that they’re not going to be as productive day one in the new system as they were in the old system that they know inside out and upside down, that they get training and knowledge transfer and just, you know, a little bit of space to take on this new thing and understand it and get to a point where they use it well. So I think there’s a combination of things there. For those of you that are leading these projects, it is not reasonable, again, to stand the thing up and say, go-live is Monday, so, you know, I expect deliverables on Tuesday. That is not okay.
AP: Yeah. And you’ve just wasted a ton of money and effort because you’ve thrown a tool at people who don’t know how to use it. So all of your beautiful setup kind of goes to waste. So there are a lot of options here as far as making sure that your content ops do succeed. And like pretty much everything else in consulting land, it is not one size fits all.
SO: It depends, as always. We should just generate one podcast and put different titles on it and just say it depends over and over again.
AP: Pretty much, we’d probably just get an MP3 of us saying that phrase over and over again and just loop it and that will be a podcast episode. And on that not-great suggestion for our next episode, I’m gonna wrap this up. So thank you, Sarah.
SO: Thank you.
AP: I think she just choked on her tea, everyone.
SO: I did.
AP: Thank you for listening to the Content Strategy Experts Podcast brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
The post Position enterprise content operations for success (podcast) appeared first on Scriptorium.
Translation troubles? This podcast is for you! In episode 173 of The Content Strategy Experts podcast, Bill Swallow and special guest Mike McDermott, Director of Language Services at MadTranslations, share strategies for overcoming common content localization challenges and unlocking new market opportunities.
Mike McDermott: It gets very cumbersome to continually do these manual steps to get to a translation update. Once the authoring is done, ideally you just send it right through translation and the process starts.
Bill Swallow: So from an agile point of view, I am assuming that you’re talking about not necessarily translating an entire publication from page one to page 300, but you’re saying as soon as a particular chunk of content is done and “blessed,” let’s say, by reviewers in the native language, then it can immediately go off to translation even if other portions are still in progress.
Mike McDermott: Exactly. That’s what working in this semantic content and these types of environments will do for a content creator. You don’t need to wait for the final piece of content to be finalized to get things into translation.
Transcript:
Bill Swallow: Welcome to the Content Strategy Experts podcast, brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we explore strategies for conquering localization challenges, and unlocking new market opportunities. Hi everybody. I’m Bill Swallow, and with me today is Mike McDermott from MadCap Software. Hey Mike.
Mike McDermott: Hi Bill.
BS: So before we jump in, Mike, would you like to provide a little background information about you, who you are, what you do at MadCap?
MM: Sure. My name is Mike McDermott. I am the director of language services at MadCap Software, working with our MadTranslations group. And we support companies that work in single-source authoring and multichannel publishing tools like those offered by MadCap Software, such as MadCap IXIA, MadCap Flare, and Xyleme, among other tools.
BS: So Mike, what are some of the challenges you’ve seen and what works for overcoming some of these localization challenges?
MM: One of the main challenges I see with companies that come to us (and they typically come to us because they’re looking at working in an XML-based authoring tool and are curious about the advantages it has for translation) is just figuring out what content needs to go into translation when you’re working in different types of tools. And one of the ways I see to solve that problem is working in a tool where you have the ability to tag certain content and identify content for different audiences or different purposes. It just makes it simpler to identify that content and get it straight into translation, and it removes a lot of the human error around packaging up content and trying to figure out yourself which files house text that might be translatable for whatever output you’re looking to build. So just working in those tools inherently helps with translation because it helps you identify exactly what needs to be translated, and it gets into translation much quicker.
BS: So I think we’re talking about semantic content there and making sure that you have all the right metadata in place so that you can identify the correct audience, the correct, let’s say, versions of the product, whether to translate or not, and any other relevant information about the content. So you’re able to isolate the very specific bits of content that need to be translated and omit a lot of the content that isn’t necessarily needed for that deliverable.
MM: Exactly, Bill. It lets the technology tell you what needs to be translated and which files house text, versus you trying to go through a file list and determine what do I need to send out to a translator to translate. The flip side of that is to just send everything for translation, but it’s very rare that everything in any given project for any type of system is going to need to be translated. So by tagging it in that way, you can quickly get into the translation and get things moving. And what I see happening at the end of these projects, oftentimes when you’re not working in those types of systems, is you end up finding bits and pieces of content or different files that needed to be translated but missed that initial pass. Now they have to go back through translation and you’re delayed. So just getting everything right the first time and relying on the tools to tell you exactly what needs to be translated by looking up metadata or different tags just simplifies the process and speeds everything up, helps translation get done quicker, and improves time to market for the end user to get their content out.
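The tagging approach Mike describes can be sketched in a few lines of code. This is a hypothetical illustration, not MadCap’s actual implementation; the element names and the `translate` attribute are invented for the example, though they mirror conventions found in standards like DITA and ITS:

```python
import xml.etree.ElementTree as ET

# A toy XML topic in which authors have tagged each element as
# translatable or not (attribute and element names invented for this sketch).
SAMPLE = """
<topic>
  <title translate="yes">Replacing the battery</title>
  <codeblock translate="no">power-ctl --shutdown</codeblock>
  <p translate="yes">Remove the four screws on the rear panel.</p>
  <p translate="no">INTERNAL BUILD NOTE: do not publish.</p>
</topic>
"""

def translatable_strings(xml_text):
    """Collect the text of every element explicitly tagged translate='yes'."""
    root = ET.fromstring(xml_text)
    return [el.text.strip()
            for el in root.iter()
            if el.get("translate") == "yes" and el.text]

print(translatable_strings(SAMPLE))
# ['Replacing the battery', 'Remove the four screws on the rear panel.']
```

The point of the sketch is that selection is mechanical: the tooling, not a person scanning a file list, decides what goes to the translator, which is what removes the human error Mike mentions.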
BS: So it sounds like it reduces a good amount of friction, especially with regard to finding missing bits and pieces that should have been translated and weren’t, and then needing to go back and make sure that’s done in time. What are some other ways that people can reduce friction in their translation workflow?
MM: Well, a big emphasis for us over the past few years around removing friction is working with connectors and different technologies that can orchestrate the translation process. So we can automate a lot of this and remove the bottlenecks around someone having to, like I said before, manually go into a set of files and package things up for a translator, zip up files, upload them to different locations. Files just get passed around, and things can go wrong when working that way, even outside of just missing files. So working with connectors and these technologies that can connect directly into these systems and get the text right into translation, removing all those friction points, just eliminates a lot of room for error, project delays, and bottlenecks for tasks that can be easily handled by modern technology.
BS: And I assume that there’s probably some technology there as well that kind of governs other parts of the workflow, like review, content validation, that type of thing?
MM: Exactly, exactly. So we’re trying to automate the flow of data into the different points in translation and then get the content ready. For example, you mentioned reviewers. Once content gets into translation, we can get it right into the translation system from the authoring environment that the customer’s working in. And as soon as the translation is done, a human reviewer on the client side or on our side or whoever can be notified that this content is ready for review, and that just helps keep things moving. So now it’s on them to complete their review. And once that’s done, the process can continue on, the automated QA checks and the human QA checks can be done at that point, and then the project can be pushed back to wherever it needs to go and put into publication. But by automating the steps and plugging in the humans where they provide the most value, it just removes the time costs and error-prone steps that don’t need to be there.
BS: So it sounds like a lot of it does come down to saving a good deal of time. I would also imagine that these types of workflows, they also help streamline a lot of the publishing needs that come after the translation as well.
MM: Correct. And that’s kind of why we started MadTranslations when we did: to provide our customers a place to go to work with a translation agency that understood these tools and understood how these bits and pieces come together to build an output. We put it together to provide our customers a turnkey solution where they can get a working project back and quickly get into publication. By removing the friction points and using modern technology to automate a lot of these processes, we’re able to get things into translation and get the translation into the final deliverable much faster. So once that happens, we can build the outputs, and if it requires a human check, things can get to that point much quicker, and we’re not waiting for somebody to manually pull down files and put them into another location so the next step can actually take place. We want to automate that part of it so we can get that final output into a project file where a customer can plug it into their publishing environment and get it out as quickly as possible. A lot of the wasted time is around those manual steps, and when it comes to validation and review, it’s often just the reviewers and validators not being ready for the validation or not being educated on how it will work. So it’s important to make sure that everyone in that process knows how it’s going to be done and when things are going to be ready for the review or the QA checks. And then the idea from there is to just feed the content in via connectors, removing the friction points, and send it through. And this is necessary, especially when you’re doing very frequent updates and more of an agile translation workflow. It gets very cumbersome to continually do these manual steps to get to a translation update. Once the authoring is done, ideally you just send it right through translation and the process starts.
BS: So from an agile point of view, I am assuming then that you’re talking about not necessarily translating an entire publication from page one to page 300, but you’re talking about as soon as a particular chunk of content is done and it’s “blessed,” let’s say, by reviewers in the native language, then it can immediately go off to translation even if other portions are still in progress.
MM: Exactly. Exactly. And that’s what working in this semantic content and these types of environments will do for a content creator is you don’t need to wait for the final piece of content to be finalized to get things into translation. So as you said, it becomes even more important when you’re doing updates because you don’t want to have to send over the entire file set every time you’re doing an update. Whereas when you’re working in a more linear format like Word, you end up having to send that full file every time, and the translation agency is likely reprocessing it using translation memory. But all that stuff still takes time and working in these types of tools, you can very quickly identify those new parts or those bits that you know are ready for translation, tag them or mark them in some way and send them through the translation process.
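One way to picture the incremental hand-off Mike describes is fingerprinting each chunk of content so the toolchain can tell which pieces changed since the last translation cycle. This is a minimal sketch under assumed data shapes (chunk ids mapped to source text), not any vendor’s actual connector logic:

```python
import hashlib

def chunk_hashes(chunks):
    """Fingerprint the current source text of each chunk id."""
    return {cid: hashlib.sha256(text.encode("utf-8")).hexdigest()
            for cid, text in chunks.items()}

def needs_translation(current, previous_manifest):
    """Return ids of chunks that are new or changed since the last hand-off."""
    return sorted(cid for cid, h in chunk_hashes(current).items()
                  if previous_manifest.get(cid) != h)

# First hand-off: record what was sent.
v1 = {"install": "Install the unit.", "safety": "Wear gloves."}
manifest = chunk_hashes(v1)

# Next authoring cycle: one topic edited, one topic added.
v2 = {"install": "Install the unit securely.",
      "safety": "Wear gloves.",
      "cleanup": "Wipe down the housing."}
print(needs_translation(v2, manifest))
# ['cleanup', 'install']
```

Only the edited and new chunks go out; the unchanged `safety` topic never re-enters the translation queue, which is the difference between an agile hand-off and resending a whole Word file on every update.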
BS: Very cool. So a lot of the work that we’re seeing now on the Scriptorium side of things is in replatforming. People have content in an old system, or they have, say, a directory full of decaying Word files, and they want to bring it into some new system. They want to modernize, they want to centralize everything, basically have a situation where they’re working in DITA or some other structured content, bring it into semantic content. What are some of the benefits, I guess, that doing that gives you as far as translation goes when you’re looking at content portability? So, being able to jump ship from one system to another.
MM: I think working in those systems where the text or the content is stored away from the output that you’re building has a lot of benefits, not only for translation, being able to just get the text that needs to be translated exported out of the system and then put back where it needs to go. It really future-proofs you and gives you the portability that you talk about to make changes, because the text is stored in a standard format that can be ported. Versus you see some organizations getting locked into a closed environment where, when they go to make a change, it requires certain types of exports to other file types that other tools can then import. By storing content in a standard way, in XML for example, it gives you that flexibility and future-proofs you from being locked into any one scenario.
BS: Excellent. So I have to ask, since I’ve come from a localization background as well, what’s one of the hairier projects that you’ve seen, or one of the hairier problems that people can run into, in a localization workflow?
MM: One of the challenges we sometimes run into is around client review, when you start incorporating validators into the translation system and include them as part of the process, and when you get multiple reviewers. Sometimes a company will assign a reviewer for every language, but you might have different people reviewing the same set of content. I mean, that’s the biggest delay that we see with projects: the translation is delivered, and then it’s dumped on the desk of a native speaker within the company, and they’re asked to review it, and they’re not ready to do the review, it’s not scheduled, and it can delay the project. That’s one of the biggest delays we see. So that’s why we try at the front end of a project to figure out, on the client side, what’s going to happen after we deliver this project, after we send the files. Is the content going to be reviewed or validated? If so, let’s figure out a way to incorporate them into our translation system where they can review the translations before we build the outputs and do all the QA checks. So that’s one of the hairier situations in terms of time delays. Expectations around time in general have always been a thing in localization. As you know, people can be surprised as to how long it can take for a translator to get through content. I mean, the technology is certainly there to speed it up. Since we started MadTranslations a little over 10 years ago, we’ve seen translation speed increase quite a bit, but it still takes time for a good translator to get through that content and know when to stop and do the research that’s needed to get a technical term right. So that’s one of the surprise moments, I think, for new buyers of localization: the time that it can take. And there are solutions in place, like I said, to make it go faster.
But if you want that human review and that expertise, the cognitive ability to know when to stop and figure out what this term is or what the client wants or doesn’t want around certain terminology, and then to database it and include it as part of the translation assets so it stays consistent every time, that takes time versus just sending something through machine translation, doing a quick spot check, and sending it back to the customer.
BS: So it sounds like having that workflow defined and setting those expectations that certain things need to happen at each point of that workflow. Some of it might be automated, some of it does require a person, and that person I guess should probably be identified ahead of time and given a heads-up that, “Hey, something’s going to be coming at you in three weeks. Be ready for it.”
MM: Be ready for it. And also, what are you ready for? So it’s kind of training a reviewer, what are you looking for here? Are we looking for key terms? Are we looking for style preferences? Everyone kind of understanding what it is that a reviewer is going to be looking for, and they might be looking for different things when it comes to technical documentation versus a website, for example. So just having everyone communicate and understand what the intended purpose of the final output is and where everyone fits in the process and defining a schedule around that process definitely helps.
BS: Definitely. I know myself, working for a translation agency, I’ve seen cases of a client coming to me and basically saying, “I need this done as soon as possible. What can you do?” And it was a highly technical manual, and we said, “Well, we have an expert in these different languages. This person is available now. This one won’t be available until next month. And this person really only works nights and weekends because they are a professional engineer in their day job.” So turnaround was going to be a little slow, and the client persisted that they just needed it as soon as possible. We need to get it out the door in a couple of weeks. And I’m thinking to myself in the back of my head, why are you coming to us now when you need this in a couple of weeks? You shouldn’t just be throwing it over the fence at the last possible minute and expecting it to come back tomorrow. So there was that education. Unfortunately, they decided that they didn’t care. They wanted us to use as many translators as possible and get it done as quickly as possible. And we had them sign documents that basically said that we are not liable for the quality of the translation, since the client was basically looking to get this done as quickly, cheaply, and dirtily as possible. It was a nightmare, and I think it took one round of review on the client side for them to circle back and say, “Okay, I get what you were saying now.” None of the translations worked together at all, because we were literally sending out each chapter to a different translator, and there was no style guide because the client hadn’t provided anything. There was no terminology set because the client didn’t provide anything, and everything came back different. And they said, “Okay, we get it. We get it. We’ll revise our schedules, get it done the right way. I don’t care how long it takes.”
MM: I’ve run into something very, very similar to what you described, and it was, put disclaimers in the documents saying this is going to be poor quality, we’re admitting it right now, this is the only way we’re going to get it back within a week, and we do not recommend publishing. And as soon as the files come back, someone looks at it and says, “Okay, let’s back up and do it the right way.”
BS: Yes. I guess the biggest takeaway there is plan ahead and plan for quality and not just try to get it done as fast as possible.
MM: And that’s one of the benefits of where we sit at MadTranslations within MadCap Software. Companies coming into these types of environments are typically at the front end, in the planning stages, trying to figure out how all this is going to work. So we have an ability to help them understand what the process looks like and then define it, in combination with our tooling and their needs, and come up with a workflow that’s going to keep things moving fast but gives you that human-level quality that everyone needs at the end.
BS: Being able to size up exactly what the process needs to look like before you’re in the thick of it definitely helps. And having that opportunity to coach someone through setting up the process for the first time, I’d say that’s definitely priceless because so many mistakes can happen out of the gate between how people are authoring content, what their workflow looks like.
MM: And it’s even more important for companies that have to maintain the content. It’s one thing to just take a PDF and say, “Hey, I need to translate this file and I’m never going to have to update it again. I just need a quick translation.” It’s another to have a team of authors dispersed around the globe working on the same set of content that then needs to be translated continuously.
So, different needs. But like you said, planning, defining the steps, and knowing what the requirements of the content are, from authoring to time to publication in each language, and how to fit the steps in to meet that as best as possible, is best done upfront versus when it needs to be published in a week.
BS: Planning, planning, planning. I think that sounds like a good place to leave it. Mike, thank you very much.
MM: Thank you, Bill. Thanks for having me on.
BS: Thank you for listening to the Content Strategy Experts Podcast, brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
The post Conquering content localization: strategies for success (podcast) appeared first on Scriptorium.
When organizations replatform from one content management system to another, unchecked technical debt can weigh down the new system. In contrast, strategic replatforming can be a tool for reducing technical debt. In episode 172 of The Content Strategy Experts podcast, Sarah O’Keefe and Bill Swallow share how to set your replatforming project up for success.
Here’s the real question I think you have to ask before replatforming: is the platform actually the problem? Is it legitimately broken? As Bill said, has it evolved away from the business requirements to a point where it no longer meets your needs? Or there are some other questions to ask, such as, what are your processes around that platform? Do you have weird, annoying, and inefficient processes?
— Sarah O’Keefe
Transcript:
Sarah O’Keefe: Welcome to the Content Strategy Experts Podcast brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we talk about replatforming and its relationship to technical debt. Hi, everyone. I’m Sarah O’Keefe. And the two of us rarely do podcasts together for reasons that will become apparent as we get into this.
Bill Swallow: And I’m Bill Swallow.
SO: What we wanted to talk about today was some more discussion of technical debt, but this time with a focus on a question of whether you can use replatforming and new software systems to get rid of technical debt. I think we start there with the understanding that no platform is actually perfect.
BS: Mm-hmm.
SO: Sorry, vendors. It’s about finding the best fit for your organization’s requirements and then those requirements change over time. Now Bill, a lot of times when we talk about replatforming, you hear people referring to the burning platform problem. So what’s that?
BS: Yeah, well, it may actually be on fire, but likely not. What we’re really talking about is a platform that was chosen many years ago. Perhaps it’s approaching end-of-life. Perhaps your business needs have taken a sharp left or right turn and the platform no longer supports those business needs. Or it really could just be a matter of cost. You know, the platform you bought 10 years ago was built upon a very specific cost structure and model. The world is different now, and there are different pricing schemes and whatnot, and you may just want to replatform to recoup some of that cost.
SO: So does that, I mean, does that work? I mean, if you exit platform A and move on to platform B, are you necessarily gonna save money?
BS: In a perfect world, yes, but we don’t live in a perfect world. Yeah, I mean, I hate to be the bearer of bad news, but if you’re looking to switch from one platform to another to save costs, there is a cost in making that switch. And at that point, you need to look at weighing the benefits and drawbacks: is the cost to move to a new system going to be worth the cheaper solution in the long run? I mean, it’s a very, very basic model to look at, and there are a lot of other costs and benefits and drawbacks to making a platform switch. But it’s one thing to consider there.
SO: Yeah, I think additionally, it’s really common to have people come to us and say, you know, our platform is burning. We’re unhappy with platform X and we want to replatform into platform Y. Now, what’s funnier is that usually we have some other customer that’s saying, I’m unhappy with platform Y and I need to go to platform X, right? So it’s just like a conveyor belt of sorts.
BS: You can’t please everybody.
SO: But the real question I think you have to ask before replatforming is, is the platform actually the problem here? Is it legitimately broken? Has it, as you said, evolved away from the business requirements to a point where it no longer meets your needs? And there are some other questions to ask, like: what do your processes around that platform look like? Do you have weird, annoying, and inefficient processes?
BS: Mm-hmm.
SO: Do you have constraints that are going to force you in a direction that maybe isn’t the best one from a technology point of view? Have you made some old decisions that are now non-negotiable? So you’ll see people saying, well, we have this particular kind of construct in our content and we’re not giving it up ever.
BS: Mh-hmm.
SO: And you look at it and you think, well, it’s very unusual, and is it really adding value? But it’s hard to get rid of because it’s so established within that particular organization. So the worst scenario here is to move from A to B and repeat all the same mistakes that were made in the previous platform.
BS: Yeah, you don’t want to carry that debt over, certainly. Anything that you have established that worked well at the time but doesn’t meet your current or future needs, you absolutely do not want to move forward. That being said, you have a wealth of content and technology that you’ve built over the years, and you want to use as much of that as possible to at least give yourself a leg up in the new system, so that you don’t have to rewrite everything from scratch or completely rebuild your publishing pipelines. You might be able to move them over and change them, and you might be able to move and refactor your content so that it better meets your needs. It’s a long way of saying that not only are you looking at a burning platform problem, you’re also looking at a futureproofing opportunity. If you are going to do that lift and shift to another platform, take a few steps back, look at what your current and future requirements are or will be, and make the necessary changes during the replatforming effort, before you get into the new system and start having to deal with the same problems all over again.
SO: Yeah, to give a slightly more concrete example of what we’re talking about: relative to 10 years ago, PDF output is less important. Ten years ago, we were getting a lot of, we need PDF, we have to output it, and it has to meet these very, very high standards. People are still doing PDF, and clients are still doing PDF, but it is less of a showstopper, primary requirement. It’s more, yes, we still have to do PDF, but we’re willing to negotiate on what that PDF is going to look like. Instead of saying it has to be this pristine and very complex output, they’re willing to drop that down a few notches. Conversely, the importance of HTML website alignment has gotten much, much higher, and we have a lot of requirements around Content as a Service and API connectors and those kinds of things. So if you look at all your different publishing pipelines, 10 years ago PDF was still unquestionably the most important thing, and that’s not necessarily the case anymore.
BS: And on the HTML side, it could be HTML, it could be JSON, but you also have a wealth of apps, whether it’s a phone app, an app in your car, or an app on your fridge, that need to be supported as well, where your PDF certainly isn’t going to cut it. And a PDF approach to content design in general is not going to fly.
SO: So when we talk about replatforming, in many cases I look at this through the lens of, okay, we have DITA content in a CCMS and we’re gonna move it to another DITA CCMS. But in fact, it goes way, way beyond that, right? What are some of the input, or I’ll say legacy, formats that we’re seeing on the inbound side of a replatforming?
BS: Let’s see, on the inbound side, we certainly have old models of DITA. So maybe something that was developed in DITA 1.1 or 1.2, or pre-1.0, something that’s heavily specialized. We have unstructured content, like Word files, InDesign, unstructured FrameMaker, and what have you. We’re also seeing an opportunity to move a lot of developer content into something that is more centrally managed; in that case, we’ve got Markdown and other lightweight formats that need to be considered and migrated appropriately. And then, of course, all of your structured content. We mentioned DITA; there’s DocBook out there; there are other XML formats and whatnot. And potentially you have other things that you’ve been maintaining over the years, and now is a good opportunity to migrate them into a system, centralize them, and get them aligned with all your other content.
SO: Yeah, and looking at this, I think it’s safe to say that we see people entering and exiting Markdown, like people saying we’re going to go from DITA to Markdown, but also Markdown to DITA. We’re seeing a lot of going into structured content in various flavors. Unstructured content, we largely are seeing as an exit format, right? We don’t see a lot of people saying, “Put us in Word, please.”
BS: No, no one’s going from something like DITA into Word.
SO: So they might go from DITA to Markdown, which is an interesting one. Okay, so I guess then that’s the entry format. That’s where you’re starting. What’s the outcome format? Where are people going for the most part?
BS: For the most part, there are essentially two winners. There are the XML-based formats, and then there are the Markdown-based formats. And I’m lumping DITA, DocBook, and other proprietary XML models all into XML. But generally, people are migrating more toward that direction than to Markdown. And there’s really a division there: whether you want the semantics ingrained in an XML format and the ability to apply, or heavily apply, metadata, or whether you want something lightweight that’s easy to author and is, I don’t want to say single purpose, but not as easily multi-channel as XML.
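To make that division concrete, here is a minimal sketch of a short procedure as a DITA task topic. The element names are standard DITA, but the topic id and content are invented for illustration.

```xml
<!-- Minimal DITA task topic (standard elements; hypothetical content). -->
<task id="replace_filter">
  <title>Replace the filter</title>
  <taskbody>
    <prereq><p>Power off the unit before starting.</p></prereq>
    <steps>
      <step><cmd>Open the access panel.</cmd></step>
      <step><cmd>Slide the old filter out and insert the new one.</cmd></step>
    </steps>
  </taskbody>
</task>
```

In Markdown, the same content would be a heading, a paragraph, and a numbered list: far easier to type, but the markup no longer records that the paragraph is a prerequisite or that each step is a command, which is exactly the semantics-versus-simplicity trade-off described above.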
SO: Yeah, the big advantage to Markdown is that it aligns you with the developer workflows, right? You get into Git, you’re aligned with all the source control and everything else that’s being done for the actual software code. And if that is a need that you have, then that’s the direction to go in. There are, as Bill said, some really big scalability issues with that, and that can be a problem down the line. Okay, so we pick a fundamental content model of some sort, and then we have to think about software. So what does that look like? What are the buckets that we’re looking at there?
BS: For software, we’ve got a lot of things. First and foremost, there’s the platform that you’re moving to. What does that look like? What does it support? You certainly have authoring tools. You also have all of your publishing pipelines; all of that’s going to require software to some degree. Some of it’s third party, some of it’s embedded in the platform itself. And then you have all of the extended platforms that you’re connecting to. Those might change, those might stay the same. You might not change your knowledge base, for example, but you still need to publish content to it from the new system, and the new system doesn’t quite work the way the old system did, so your connector needs to change. Things like that. I would also say that, with regard to software, there’s a hit, a temporary but costly blip, in the localization space. When you are replatforming, especially if you are migrating to a new format, you’re going to take a hit on your 100% matches in your translation memory. Anything that you’ve translated previously, you’ll still have those translations, but how they are segmented will look very different in your localization software.
SO: Yeah, and there are some weird technical things you can do under the covers to potentially mitigate that, but it’s definitely an issue.
BS: And it’s still costly.
SO: OK, so we’ve decided that we need to replatform and we’ve done the business requirements and we picked a tool and we’re ready to go from A to B, which we are carefully not identifying because some of you are going from A to B and some of you are going from B to A. And it’s not wrong, right? There’s not a single, you know, one CCMS to rule them all.
BS: Mh-hmm.
SO: They’re all different and they all have different pros and cons. So depending on your organization and your requirements, what looks good for you could be bad for this other company. But within that context, what are some of the things to consider as you’re going through this? So you need to exit platform A and migrate to platform B.
BS: Mm-hmm. I think the number one thing you should not do is expect to be able to pick up your content from platform A and just drop it in platform B. It’s never going to be that easy, and it shouldn’t be something that you’re really considering, because not only are you replatforming, you’re aligning with a new way of working with your content. Just picking it up and dropping it in a new system is not going to help you in that regard. And given that you need to get the content out of the system anyway, that’s the best time to look at your content and say, how do we clean this up? What mistakes do we try to erase with a migration project on this content before we put it in the new system?
SO: Yeah, I think it’s the decisions that were made that tend to take on a life of their own, like, this is how we do things. And much, much later you find out that it was done that way because of a limitation in the old software. This is like that old story about cutting the end off the pot roast, where it turned out that Grandma did that because the roasting pan wasn’t big enough to hold the entire pot roast. It’s exactly that, but software, right? So for bad decisions or constraints, you need to test your constraints to see whether your new CCMS is, in fact, a bigger roasting pan that does not require you to cut the end off the pot roast. What about customization?
BS: Customization is a good one. What we’re finding is that a lot of the people who are exiting an older system for a newer one have a lot of heavy customization, because in many regards there wasn’t a robust content model available at the time. They had to heavily specialize their content model and tailor it to the type of content they were developing. Now, take something that was built 10 or 15 years ago using heavily specialized structured content: if you look at what’s available now, a lot of those specializations have been built into the standard in some way. So it’s a great opportunity to unwind a lot of that and use the standard rather than your customization. That helps you move forward: as the specification for the content model changes, you will be aligned with that change a lot better than if you had carried a customization along the way. Specializations, or any kind of customizations for that matter, are expensive. They’re expensive to build, expensive to maintain, expensive to train people on. They affect every aspect of your content production, from authoring to publishing. Something always needs to be specifically tailored, whether it’s training for the writers, or designing your publishing pipelines to understand and render those customized models, or working with the translators to make sure their systems understand your custom tags, so that the tags can be hidden from the translators and you don’t get translations back that contain translated tags, which we’ve seen. There’s a lot going on there. So the more you can unwind, if you have heavily customized in the past, the better off you will be.
SO: Yeah, and here we’re talking specifically about some of the DITA stuff. So if your older legacy content is in DITA 1.0 or 1.1, they added a lot of tags in DITA 1.3, and they’re adding more in DITA 2.0, which might address some of those gaps. If you added a specialization because there was a gap or a deficiency in the DITA standard, you could probably take that away and just use the standard tag that got added later. Now, I want to be clear that we’re not anti-specialization. I think specialization is great, and it’s a powerful tool to align the content that you have, and your content model, with your business requirements. But you have to make sure that when you specialize, all the things that Bill’s talking about, all those costs that you incur, are matched by the value that you get out of having the specialization.
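One concrete instance of the standard catching up: DITA 1.3 added a dedicated troubleshooting topic type, so a home-grown troubleshooting specialization from the 1.0 or 1.1 era can often be unwound into standard markup along these lines. The id and content here are invented for illustration.

```xml
<!-- Standard DITA 1.3 troubleshooting topic; hypothetical content. -->
<troubleshooting id="no_power">
  <title>Device does not power on</title>
  <troublebody>
    <condition><p>The power LED stays dark after you press the button.</p></condition>
    <troubleSolution>
      <cause><p>The power cable is loose.</p></cause>
      <remedy>
        <steps>
          <step><cmd>Reseat the cable at both ends.</cmd></step>
        </steps>
      </remedy>
    </troubleSolution>
  </troublebody>
</troubleshooting>
```

Because these elements are part of the standard, every conforming editor, pipeline, and translation tool already understands them, which is the maintenance saving being described.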
BS: Mm-hmm.
SO: So you’re going to specialize because it makes your content better, and you have to make sure that it makes it enough better to be worth all of those costs. Very broadly, metadata customization nearly always makes sense, because that’s a straight-up, we have these kinds of business divisions or variants that we need because of the way our products operate. Element specialization tends to be a bigger lift, because now you’re looking at getting better semantics into your content, and you have to ask the question: do I really need custom things, or is the out-of-the-box DITA, DocBook, or custom XML content model good enough for my purposes? That’s kind of where you land on that. And then I did want to touch on reuse briefly, because we can do a lot of things with reuse, from reusing entire chunks, topics, paragraph sequences, lists of steps, that kind of thing, all the way down to individual words or phrases. And the more creative you get with your reuse and the more complex it is, the more difficult it’s going to be to move it from system A to system B.
BS: Absolutely. It’ll be a lot more difficult to train people on as well. And we’ve seen it more times than not that even with the best reuse plan in mind, we still see what we call spaghetti reuse in the wild, where someone has a topic or a phrase or something in one publication and they just reference it into another publication. Some systems will allow that, I’ll just put that out there. Other systems will say, absolutely not, you cannot do this; you have to make sure that whatever you’re referencing exists in the same publication that you’re publishing. So we’ve had to do a lot of unwinding there with regard to spaghetti reuse. We’ve had a podcast in the past with Gretyl Kinsey on our side, who I believe talked extensively about spaghetti reuse: what it is, what it isn’t, and why you should avoid it. But yes, as you’re replatforming, if you know you have cases like this, it’s best to get your arms around it before you put your content in the new system.
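For readers who haven’t run into it, spaghetti reuse usually shows up as a conref that reaches into an unrelated publication, and the usual cure is a shared library topic that both publications reference. The file paths and ids below are invented for illustration.

```xml
<!-- Anti-pattern: publication B pulls a warning straight out of
     publication A's topic, coupling the two books together. -->
<note conref="../pub_a/installing.dita#installing/safety_warning"/>

<!-- Cleaner: both publications reference a shared warehouse topic. -->
<note conref="../shared/warnings.dita#warnings/safety_warning"/>
```

The second form migrates cleanly because the dependency is explicit and shared, rather than buried in another deliverable.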
SO: Yeah, and we’ll see if we can dig it out and get it into the show notes. What about connectors?
BS: Connectors are interesting. And by that, we’re talking about either webhooks or API calls from one system to another to enable automation of publishing, sharing of content, and what have you. For the most part, if you’re not changing one of the two systems, managing that connector can be a little bit easier, especially if the target, the receiving end of the content, is reaching out and looking for something in a shared folder, using a webhook or an FTP server, what have you. But generally, those connectors can get a little sketchy. It might be that your new platform doesn’t have canned connectors for the other systems that you have always connected to and need to connect to. So then you need to start looking at, well, do we need to build something new? Can we find some kind of creative midpoint? They can get a little dicey. So I think it’s important, before you replatform, before you even choose your new content management system, to look at where your content needs to go and whether you have support from that system to get you there.
SO: So a simple example of this is localization. If you have a component content management system of some sort, you’ve stashed all your content in, and then you have a translation management system. And the old legacy system, the platform you’re trying to get off of, has or maybe doesn’t have, but you need a connector from the component content management system over to the TMS, the translation management system, and back so that you can feed it your content and have the content returned to you.
BS: Mm-hmm.
SO: Well, if that connector exists in the legacy platform, but not in the new platform, you’re gonna have to either lean on the vendors to produce a new connector or go back to the old zip and ship model, which nobody wants, or conversely, you were doing a zip and ship in the old version, but the new version has a connector, which is gonna give you a huge amount of efficiency.
BS: Mm-hmm.
SO: The connectors tend to be expensive and also they add a lot of value, right? Because if you can automate those systems, those transfer systems, then that’s going to eliminate a lot of manual overhead, which is of course why we’re here.
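As a sketch of what a small custom connector actually does when no canned one exists: export content from the CCMS, package it into a job, and hand it to the TMS over its API. Everything here, the endpoint path, the payload fields, the job id, is hypothetical; a real connector uses whatever API the two vendors actually expose.

```python
# Hypothetical CCMS-to-TMS connector sketch. Endpoint paths and payload
# fields are invented; real systems each define their own API.
import json

def build_tms_job(topics, source_lang, target_langs):
    """Package exported topics into one translation-job payload."""
    return {
        "source": source_lang,
        "targets": sorted(target_langs),
        "files": [{"id": t["id"], "content": t["xml"]} for t in topics],
    }

def submit_job(session, base_url, job):
    """POST the job to the (hypothetical) TMS and return its job id."""
    resp = session.post(
        f"{base_url}/jobs",
        data=json.dumps(job),
        headers={"Content-Type": "application/json"},
    )
    resp.raise_for_status()
    return resp.json()["jobId"]
```

Automating that hand-off, instead of zipping files and emailing them around, is where a connector earns back its cost.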
BS: Mm hmm. Human error as well.
SO: So they’re worth looking at pretty carefully to see, as you said, Bill, what’s out there, what already exists. Does the new platform have the connectors I need? And if not, who do I lean on to make that happen, so that I don’t go backwards, essentially, in my processes? Okay, anything else, or should we leave it there?
BS: I think this might be a good place to leave it. We could talk for hours on this.
SO: It would be a good place to leave it. Let’s not and say we did. Okay, so with that, thank you for listening to the Content Strategy Experts podcast, brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
The post Cutting technical debt with replatforming (podcast) appeared first on Scriptorium.
Just like discovering faulty wiring during a home renovation, technical debt in content operations leads to unexpected complications and costs. In episode 171 of The Content Strategy Experts podcast, Sarah O’Keefe and Alan Pringle explore the concept of technical debt, strategies for navigating it, and more.
In many cases, you can get away with the easy button, the quick-and-dirty approach when you have a relatively smaller volume of content. Then as you expand, bad, bad things happen, right? It just balloons to a point where you can’t keep up.
— Sarah O’Keefe
Transcript:
Alan Pringle: Welcome to the Content Strategy Experts Podcast brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we talk about technical debt and content operations. What is technical debt and can you avoid it? Hey everybody, I am Alan Pringle and I’ve got Sarah O’Keefe here today.
Sarah O’Keefe: Hey everybody.
AP: And we want to talk about technical debt, especially in the context of content operations. And to start off, we should probably have you define what technical debt is, Sarah. I think this is something most people run into during their careers, but they may not have had a label to apply to what they were dealing with. So what is technical debt?
SO: We usually hear about technical debt in the context of software projects. It’s something along the lines of taking the quick-and-dirty solution, which then causes long-term effects and long-term costs. Wikipedia says it’s the implied cost of future reworking because a solution prioritizes expedience over long-term design. And that’s really it. I have this thing, I need to deliver it this week, I’m going to get it done as fast as possible. But then later, I’m going to run into all these problems because I took the easy road instead of the sustainable one.
AP: So it’s basically when the easy button bites you in the backside weeks, months, years later.
SO: Yeah, and with any luck you are aware that you’re incurring technical debt. The one that’s really painful is when you don’t realize you’re doing it.
AP: Right, or you didn’t know, because you weren’t part of the process when it happened. And I think this is kind of moving into where I want to go next. Let’s talk about some examples, especially in the context of content, of where you can incur or stumble upon technical debt.
SO: So right now, the example we hear most often is that inconsistencies and problems in the quality, organization, and structure of your content lead to a large language model, or AI, misinterpreting information, and therefore your generative AI strategy fails. Essentially, because the content isn’t good enough, genAI tries to see patterns where there are none and then produces some stuff that’s just complete and utter junk. Now, the interesting thing about this is that you were probably aware, at least at a high level, that your content wasn’t perfect. But the LLM highlights it; it’s like a technical debt detector. It will show you: look at you, you took a shortcut and it didn’t work, or you didn’t fix this, and so here we are. Another good example of this is any sort of manual formatting that you’re doing. So you’re producing a bunch of content, a bunch of docs, a bunch of HTML pages, PDF, whatever, and you’ve got some step in there that involves cleaning it up by hand. So for 90 to 95% of it, I just apply the template and it all just works, but then I’ve got this last step where I’m doing a couple of little finicky cleanup things, and that’s okay because it’s just an hour or two and all I’m delivering is English. Okay, well, along comes localization, and suddenly you’re delivering in not just one language but two or three or a dozen or 27, and what looked like one hour in English is now 28 hours: one time for English and 27 times again where you’re having to do this cleanup. And so all of a sudden your technical debt balloons into something that’s basically unsustainable, because that choice you made to not automate that last 5% suddenly becomes a problem.
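The arithmetic in that localization example is worth spelling out, because a fixed manual step scales linearly with locale count. A back-of-the-envelope sketch, using the numbers from the example rather than real project data:

```python
# Back-of-the-envelope: manual cleanup cost grows linearly with locales.
# One hour per locale and 28 locales (English + 27) come from the example.
def cleanup_hours(hours_per_locale, locale_count):
    """Total hand-cleanup time when every locale needs the same pass."""
    return hours_per_locale * locale_count

english_only = cleanup_hours(1, 1)    # 1 hour per release
all_locales = cleanup_hours(1, 28)    # 28 hours per release
```

The automation cost is paid once; the manual cost is paid again for every locale on every release, which is why the debt balloons.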
AP: It’s a scalability issue, really, at the core.
SO: Yeah, in many cases, you can get away with the sort of, as you said, the easy button, the quick-and-dirty approach when you have a relatively smaller volume of content. And then as you expand, bad, bad things happen, right? It just balloons to a point where you can’t keep up.
AP: Yeah, and I have recently run into some technical debt, not in the content world, but in the homeownership world. And I’m sure this painful story will resonate with many people and not in a good way. But how many times have you gone to update a kitchen, update a bathroom, only to discover that there was some weird stuff done with the wiring? The plumbing is not like it really should have been. And basically you want to jump into a time machine, go back to when your house was built to have either a gently corrective conversation with the people who are building your house or just murder them outright because you are now having to pay to untangle the mess that was made 30, 40, 50 years ago. I am there right now and it is not a happy place.
SO: And whatever it was they did was presumably cheaper than doing it right. But doing it right at the time would have cost maybe an extra 5% over what they actually paid to do it the cheap way. Now it’s compounded, because in the case of plumbing, you’re having to tear out walls and go back and replace all these pipes. You have to essentially start over instead of just doing it right the first time. Another great example of this is accessibility. Think about a house that has grab bars, or wide doorways that wheelchairs will fit through, right? If the house was built with that, it costs a little bit more, not a lot, but a little. But when you go back to retrofit a house with that stuff, it is stupidly expensive.
AP: Exactly. And these things that we’re talking about in the physical world very much apply when you’re talking about software infrastructure and tool infrastructure, and it can be just as bad.
SO: Yeah, I mean, there’s a perception of it’s just software, right? We’re not doing a physical build. We’re not using two by fours. So how bad could it be? It can be real bad. But that is the perception, right? That we’re not building a physical object so we can always go back and fix it. And I mean, you can always go back and fix everything. It’s just how much is it going to cost?
AP: Right, how much time and money and effort is it going to suck up to get you to where you need to be so you can then do the next thing that you intended to actually do in the first place? So yeah, I think this is something where this technical debt, sometimes there is no way around it. You inherit a project, you’ve got some older processes in place and you’re gonna have to deal with it. Are there some strategies that people can rely on to kind of mitigate and make it less painful?
SO: Well, first I’ll say that not all technical debt is bad or destructive. The canonical example of this is when you’re trying to figure out, is this thing gonna work? I wanna do a proof of concept; I wanna see if the strategy that I’m considering is even feasible. So you take a small amount of content and you build out a proof of concept or a prototype: look, we were able to generate this PDF over here and this HTML over here, and we fed it into the chatbot, and everything kind of worked. And you look at it and you say, okay, that was good enough. Because it was a proof of concept, you maybe didn’t harden it from a design point of view; you just did what was expedient and you got it done. That’s fine, provided you go into it with your eyes open, knowing where you cut the corners, recognizing that later we’re going to have to do this really well and we probably can’t use the proof of concept as a starting point, or, it’s good enough and we can use it as a starting point, but here’s where we cut all the corners. You have this list of: we didn’t put in translation and localization support, we didn’t put in all the different output formats we’re going to need, we just put in two to prove that it would more or less work. But I think you made a really good point earlier: so often you inherit these things. You walk into an organization, you’re brand new to that organization, and you get handed a content ops environment. This is how we do things. Great. And then the next thing that happens is that genAI comes along, or a new output format comes along, or we’ve decided we want to connect it to this other software system over here that we’ve never thought about before, or, hey, we’re bringing in a new enterprise resource planning system and we need to connect to it, which was never in the requirements on day one.
And now, looking at your environment, you realize that you can’t get from what you have to where you need to be, because the requirements shifted underneath you. Or you came in and you just didn’t have a good understanding of how and when these decisions were made, because it was five or 10 years ago, with your predecessor, kind of thing. So how do we deal with this? I mean, it just sounds awful, but you have to manage your debt just like actual debt.
AP: All right, sure.
SO: Right, so understand what you have and haven’t done. We have not accounted for localization; we’re pretty concerned about that if and when we get to a point where we’re doing localization. Scalability: we are only going to be able to scale to maybe 10 authors, and if we end up with 20, we’re going to have a big problem, so let’s just be aware of that when we get to eight or nine. But beyond the technical debt that you identify, that you know about, unlike personal finance, hopefully, you always have more debt than you think you have, right? Because in the content world, things change. Or in your housing example, the building code changes. So they built the thing umpteen years ago, and it was okay in the sense that it conformed with the requirements of the building code at the time, I assume.
AP: Of course.
SO: And now you’re going in and you’re making updates and suddenly the new building code is in play and you’re faced with the technical debt that accrued as the building code changed, but your house, your physical infrastructure did not change. And so there’s a gap between where you need to end up and where you are, part of which is just time has elapsed and things have changed.
AP: Right, and that is very true of some of the requirements you mentioned in regard to content operations. Generative AI, that’s what, the past two years, if that? That wasn’t on the horizon five years ago when some decisions were made. It absolutely parallels. And when it comes to personal finance, sometimes things get so bad you have to declare bankruptcy. I think that can apply to technical debt as well.
SO: Yeah, it’s an unhappy day when you look at a two-story house and you’ve been told to build a 50-story skyscraper. It just can’t be done, right? You cannot take a stick-built house made of wood and put 50 stories on top of it. At least I don’t think so; we’ve now hit the edges of what I know about construction, so sorry to all the construction people. You build differently if you know that it’s going to be required to be 50 stories, even if you only build the initial two. So either you build two knowing that eventually you’ll scrape it and start over with a new foundation, or you build what amounts to a two-story skyscraper, right, that you can then expand as you go up. So you overbuild, I mean, completely overbuild, for two stories, knowing where you’re going.
AP: Scalability.
SO: But yeah, we have a lot of clients who come in and say, you know, we’re in unstructured content: Word, unstructured FrameMaker, InDesign, basically a PDF-only workflow. And now we need a website, or we need all of our content in a Content as a Service, API kind of scenario. And they just can’t get there. From a document-based, page-based, print-based, PDF-targeted workflow, you can’t get to, and also I wanna load it into an app in nifty ways. I mean, you could load the PDF in, but let’s not. So you end up having to say, this isn’t gonna work. This is the, I have a two-story suburban house and I’ve been told to build a 50-story skyscraper. Languages and localization are really, really common causes of this. So separately from the, “I need a website in addition to PDF,” the, “We were only going to one or two languages, but now we’re going to 30 because we’re going into the European Union,” is a really, really common scenario where suddenly your technical debt is just daunting.
AP: So basically you’re in a burn it all down situation. Just stop and start all over again.
SO: Yeah, I mean, it’s not that you did it wrong. It’s that your requirements changed and evolved, and your current tools can’t do it. So it’s a burning platform problem, right? The platform I’m on isn’t going to work anymore, and so I have to get to that other place. It’s really unpleasant. Nobody likes landing there, because now you have to make big changes. And so I think ideally, what you want to do is evolve over time, evolve slowly, keep adding, keep improving, keep refactoring as you go so that you’re not faced with this just crushing task one day. But with that said, most of the time, at least the people we hear from have gotten to the crushing horror part of the world because it’s good enough. It’s good enough. It’s not great. We have some workarounds. We do our thing until one day it’s not good enough.
AP: And it’s very easy to get used to those workarounds. That is just part of my job. I will deal with it. You kind of get a thick skin and just accept that’s the way that it is. While you’re doing that, however, that technical debt in the background is accruing interest. It’s creeping up on you, but you may not really be that aware of it.
SO: Right. Yeah, I’ve heard this called the missing stair problem. So it’s a metaphor for the scenario where, again, in your house or in your life, there’s a staircase and there’s a stair missing and you just get used to it, right? You just climb the steps and you hop over the missing stair and you keep going. But you bring a guest to your house and they keep tripping on the stairs because they’re not used to it, at which point they say, what is the deal with the step? And you’re like, yeah, well, you just have to jump over stair three because it’s not there or it’s got a, you know, missing whatever. So missing stair is this idea that you can get, you can get used to nearly anything and the workaround just becomes, “Get used to jumping.”
AP: And it ties in again: there’s technical debt there, but you’ve kind of almost put a bandaid on it. You’re ignoring it. You’ve just gotten used to it. So really, is there no way to prevent this? Is it preventable?
SO: I mean, if you staffed up your content ops organization to something like 130% of what you need for day-to-day ops and dedicated the extra 30, or maybe 10, but you know, the extra percentage, to keeping things up to date and constantly cleaning up and updating and refactoring and looking at new tools, then maybe. So no, there’s no way to do it, and everybody is running so lean.
AP: I’m gonna translate that to a no. That is a long no. So yeah.
SO: And as a result, you make decisions and you make trade-offs and that’s just kind of how it is. I think that it’s important to understand the debt that you’re incurring, to understand what you’re getting yourself into. And, you know, I don’t want to, you know, beat this financial metaphor to death, but like, did you take out like a reasonable loan or are you with the loan sharks? Like how bad is this and how bad is the interest going to be?
AP: Yeah, so there’s a lot to ponder here, and I’m sure a lot of people are listening to this and thinking, I have technical debt and I’ve never even thought about it that way. It is an unpleasant topic, but it is something that needs to be discussed, especially if you’re a person coming into an organization and inheriting something. You may not have had any say in the decisions that were made ten years ago or five years ago, and things have changed so much that it might be why they’ve brought you in. So it is something that you’re gonna have to untangle.
SO: Yeah, sounds about right. So good luck with that. Call us if you need help, but sorry.
AP: Yeah, so if you do need help digging out of the pit of technical debt, you know where to find us. And with that, I’m going to wrap up. Thank you, Sarah. And thank you for listening to the Content Strategy Experts podcast brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
SO: Thank you.
The post Renovation revelations: Managing technical debt (podcast) appeared first on Scriptorium.
In episode 170 of The Content Strategy Experts podcast, Bill Swallow and Christine Cuellar dive into the world of content localization strategy. Learn about the obstacles organizations face from initial planning to implementation, when and how organizations should consider localization, localization trends, and more.
Localization is generally a key business driver. Are you positioning your products, services, what have you for one market, one language, and that’s all? Are you looking at diversifying that? Are you looking to expand into foreign markets? Are you looking to hit multilingual people in the same market? All of those factors. Ideally as a company, you’re looking at this from the beginning as part of your business strategy.
— Bill Swallow
Related links:
LinkedIn:
Transcript:
Christine Cuellar: Welcome to the Content Strategy Experts Podcast, brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we are talking about content localization strategy. So maybe you’re starting to think about introducing a localization strategy. Maybe you’re hitting some pain points in your localization processes, all that good stuff we’re going to be talking about today. Hi, I’m Christine Cuellar.
Bill Swallow: And I’m Bill Swallow.
CC: Bill, thanks for being here today to talk about localization. Bill is our go-to localization expert, and localization has been coming up a lot. I’ve noticed, for me on the marketing side of things, there’s been a lot of, you know, SEO stuff coming up for localization. People seem to be searching about it, asking questions at a more introductory, just-starting-to-think-about-localization level. So that’s what we wanted to talk about today: give you the chance to have some upfront knowledge about what you could be getting into with introducing localization in your content strategy. And yeah, let’s talk about it with an expert. So thanks, Bill.
BS: Thank you.
CC: First things first, the most basic question, what is content localization strategy? So what do we mean by that?
BS: Okay, so I can kind of frame this in, I guess the same point of view as a content strategy, but basically you’re taking a look at your entire localization process from start to finish. Plus you’re looking at what are the systems that are involved? How are authors prepping the content for localization? Are they writing well upfront? What does the publishing preparation look like? How are you choosing your translators? Are you going to pure machine translation? Are you using live people to do the translation? Are you using people who are content experts? Are you using people who are market experts? So there are a lot of different factors there that all kind of get balled up into this grander strategy of how are you going to approach getting your content authored and translated appropriately in other regional markets.
CC: Yeah, okay. That makes sense. And taking a step back even further, can you walk me through the difference between localization and localization strategy?
BS: Sure. Localization itself is kind of more of an action, whereas strategy is more planning around that action. I think that’s the best way to put it. So localization involves a bunch of different things. It involves the act of internationalization. So that’s prepping your content, your code, your product, whatever it is, to be delivered for multiple regional and language markets. And then you have the translation component of localization, which is actually getting things written, spoken, however, in other languages. And the strategy piece is more bridging both of those and adding additional components so that you have a solid plan for every step in that process.
CC: Okay, yeah, that makes sense. And where do we step in? We here at Scriptorium, where do we sit?
BS: Generally we at Scriptorium, we sit on the source content authoring side. And we look at the overall content strategy, and we do look at a localization strategy as a component of that. They’re not separate. They’re very intertwined and we need to take a look at really both of them. So a lot of our clients do come to us because they have localization requirements.
And we have to account for those in the content strategy that we build for them. So we’re looking not only at the source content authoring process and what needs to happen in that to get the job done, but we also have to look at where they are going with their content. How are they going to localize it? What do they need to localize? What processes do they have in place now? Are they working? Are they not? Looking at systems, are they adequate? Are they not? And look at the markets. Are they already reaching those markets? Do they need to do something different? How do we need to position the content as it moves through that funnel of production so that when it comes out the other side, it is ready for those markets? So they’re kind of intertwined there.
CC: Okay. Yeah. So when are organizations typically thinking about a content localization strategy?
BS: Well, localization generally is a key business driver. Are you positioning your content for one market, one language, and that’s all? Or are you positioning… I shouldn’t say just product, because it’s products, services, what have you. Are you looking at diversifying that? Are you looking to expand into foreign markets? Are you looking to hit multilingual people in the same market? All of those factors. So ideally, as a company, you’re looking at this from the beginning as part of your business strategy. What are you producing? Who are you producing it for? How do they need to consume it? So as soon as you catch a whiff of those multilingual requirements, bells should be going off saying, “Hey, we need a plan for this.” More commonly, an organization might be producing for one market or for several markets, kind of doing things ad hoc: producing content, then sending it out to a translator. They get something back, they may polish it up, or it’s a finished product and then they send it out. It’s a very time-consuming process. It’s a very costly process, and it’s very difficult to juggle when things will be done. Because if you don’t have a set process around things, and you don’t have an idea of how long things will take or what efficiencies you’re able to build up front and so forth, you’re throwing caution to the wind and just putting stuff out there and hoping that it comes back in time so that you can go to market with it. We’ve worked with clients who have said that generally it takes about nine months or so to get their localized product out the door and into the market after the English is done. And for a lot of those, we’ve brought that number down to three months, or one month, depending on exactly what they’re producing and how they need to produce it, so-
CC: Yeah, it’s a huge difference.
BS: Looking at that… Oh, huge difference. And looking at that time to market, that’s perhaps more valuable than the cost that you’re dumping into putting a localization strategy or a content strategy together because you’re able to sell quicker into those markets. You’re not waiting for the opportunity to start seeing revenue come back from the initiatives that you’re taking to get stuff out there.
CC: Yeah. Yeah, that makes sense. And I feel like… So correct me if I’m wrong here, but in the global world that we live in, it feels like localizing products and getting them ready for new regions is a very… I think that would be something that executives think about from the get-go like, yes, of course we want our product ready for new regions and locations. But why is the… It sounds like maybe the content piece of that is not thought about or maybe left behind until it’s an absolute emergency. Would you say that that’s… First of all, is that accurate?
BS: Sadly, I’d say yes.
CC: Okay.
BS: Content is often an afterthought in general, whether we’re talking about producing stuff just in your native language for a native market. Localization is usually even more of an afterthought because it’s like, oh, well, we wrote it in English, we’ll just have someone translate it. And by then you’re waiting until that product is done and then sending it to somebody else who’s looking at it going, “I can’t make sense of this. It’s not written well. And I’m going to take my best guess at how to translate this.” It could take months to get that back.
CC: So maybe organizations see the value in having their products and services available in other markets, but they don’t necessarily think of all of the content localization pieces that are involved in getting that out the door.
BS: No, and it’s similar for pretty much anyone trying to get anything done. For example, I really want to put a new patio in the back of my house. I even have an idea of exactly how that should go in. But I don’t have the time, I don’t have the materials needed to do it, and I’d much rather rely on somebody else who knows what they’re doing to put it in the correct way, so it’s not graded improperly, so that there aren’t uneven portions that people will trip over, and so forth. It’s the same thing with localization. People who are running a company or starting a company may have an idea that, yes, they need to get from point A to point B to point C to point D. But they don’t know the steps along that path, and they need some help figuring out that it’s not just that you write your English content, you throw it over to somebody else, and they send it back. It’s a more intricate process. You have some systems in place that will manage that handoff, that will allow people to gate the content and proof it and make sure it’s correct before it goes anywhere. And you may have some other efficiencies built in that allow you to automatically format things when the time comes to actually produce. So there are a lot of bits and pieces that people just generally don’t think about because it’s not in their wheelhouse.
CC: Yeah, they can’t know what they don’t know.
BS: Exactly.
CC: Okay. So it sounds like most organizations realize that this is a problem once they’re actually trying to get their product out the door and into a new market, into a new region. What are some obstacles to getting a content localization strategy set up? I’m sure that one issue is probably like, oh, you’re in emergency mode and we just need to get this product out the door. That might present a challenge in and of itself.
BS: Absolutely.
CC: Yeah. Are there other obstacles as well to getting a more future-focused strategy in place?
BS: Oh, that one is a good one. That is the first hurdle to get over.
CC: Is the emergency mode.
BS: So being able to recognize or realize that you’re in emergency mode and getting out of that mindset and saying, okay, this doesn’t have to be a forever problem of just waiting and hoping for good quality coming out in the end. Once you’re able to break that mindset and start looking forward, then we start hitting other obstacles. One of them is going to be funding, because there will be systems involved, there will be personnel required, there will be processes that need to change and so forth. And that will certainly cost a lot upfront, but you’re going to see that return on investment in a pretty quick amount of time. We’ve seen one company make their investment back within a year, but they were producing an insane number of languages already, and they just needed to tidy up their process. And again, by bringing that window in from nine months to about a month and a half or so to get their localized stuff out, they were able to quickly realize that return on investment. But another obstacle is buy-in, because you have a lot of people who are busy doing their job, and you’re suddenly telling them that they need to change how they do their job. It might mean abandoning the tools that they like to use, writing in a different way, looking at publishing in a different way, and interacting with people they normally don’t interact with on a day-to-day basis. So your source authors are interacting with a localization manager internally who needs to send stuff out to translators, or your writers are interacting with translators to explain what they had written so that the translator has a definitive idea of what it is and how to translate it for the market they’re translating for. And then of course you have the obstacle of governance, and change management comes along with that.
You need to be able to make sure that with any of the changes you introduce, people are following the new way of doing things and aren’t falling back to old bad habits, or even old habits that were good at the time. And you need to make sure that you have these gating processes: once something is written in English, you have a formal review to make sure it’s correct, to make sure it’s written appropriately. That goes out to translation. The translators have their own gating process of making sure they receive all the files, that they understand the content that they have, and all the supporting information they need to help them translate and localize this information for that market. Then of course, they do their own quality checks. It comes back, and you make sure that there’s a final review on the company side to make sure the translation seems good. And then you’re able to publish and deliver. So it still sounds like a lot of gating factors, but once you get things going and figure out where you can expedite and make things a lot easier, you start to bring in that entire timeline.
CC: Yeah, that makes sense. You mentioned buy-in, and so I could see how if people feel like their workload’s being increased by suddenly needing to talk to more people, coordinate between more departments or even just have more things on their radar, I could see how that could create a lot of, oh, I don’t know if I want to go in this direction. What are some ways… And that’s probably one of the… As you mentioned, that’s just one of a few buy-in challenges. What are some of the ways that you maybe win people over or show people how this can benefit their work life versus just make it harder?
BS: That’s a good question. I think that authors in general want to understand where their content is going and who is consuming it. We’re talking about corporate content, everything from website content to product manuals to troubleshooting tips and training materials. So even though it belongs to the company, a lot of authors tend to have a kind of, I guess, personal pride built around what they write.
CC: Yeah, okay. Yeah, that makes sense.
BS: So knowing who is consuming it down the road and the reason why you have these additional checkpoints and processes in place will kind of help, I think get a lot of them around the idea of, yeah, this is a good thing and I’m looking forward to helping any way I can. Because the last thing they want is to have something written completely correctly in English and have it go out to, I guess let’s say a market in Denmark. And the content was translated incorrectly because the translator maybe didn’t understand what something meant, and they gave it a different term, which had a different meaning in that market.
CC: Yeah. And I could also see from a safety standpoint, that could be really dangerous too, if you’re not properly translating instructions for high stakes content, medical devices, stuff like that. Just like you do in English, you want that content to be accurate and understandable. Because if it’s not accurate, of course it’s wrong and people could get hurt also if people don’t understand it, even if it’s totally accurate. But it’s just hard to understand. That presents, I’m sure, a lot of dangerous situations where your people could get hurt and your company is liable. So yeah, it makes sense that you would really want to have a good process in place.
BS: Oh, absolutely. And even more along those lines, the regulations that we have to adhere to here in the US are somewhat different to… very different to anywhere else in the world. There are different directives in place depending on where you are regionally, things that have to be included, things that have to be said a very specific way. So I guess the easiest way to look at it is that there are more legal ramifications in the US. You could get sued if something is wrong. Whereas if you go over to the UK, it’s generally more that there’s a directive you have to follow, and you simply cannot release in that market if your, for example, machinery content does not meet that specific directive’s requirements. So it’s a slightly different approach. There’s still a legal ramification if things go wrong, but there’s also another set of requirements that needs to be met before you even start worrying about the legal stuff.
CC: And are most organizations aware of those kind of requirements when they start trying to get into a new market?
BS: Some of them might be, but again, if you’re in one particular region, chances are that’s the region you’ve grown up with and that’s the region you understand. And there’s been very little attention paid to the requirements in other geographic regions, other countries, and so forth. So I can’t say whether it’s common or not common, but in general, you know what you know. And when you’re looking to move to a foreign market, it’s a foreign context. You’re going to have very little insight into what that foreign market demands, by its very nature. As a company moves into a new language market or a new geographic market, they’re going to learn things as they go, and they’re going to bring that knowledge back and refine how things are being done so that it also satisfies the new requirements. And it’s going to be an iterative process until they really get their arms around it. And again, going back to a localization strategy for your content, you can start putting those feelers out. Because if one market has one set of requirements, and wait a minute, now we want to go to three, what are the requirements for the other two before we even start moving in that direction? So you’re able to keep building on that strategy that you’re developing. I mean, we’re not experts in all the requirements for every single market on the face of the earth, I can say that outright, but we can help companies start to identify what they need to look into before they start running.
CC: So since we mentioned one of the reasons this topic came about was seeing some SEO search trends, people trying to get more information on localization. What other trends are you seeing in localization right now?
BS: I think the big one is still going to be machine translation. It’s continually evolving and it’s getting smarter, still not, I would say, better than a human. It’s certainly quicker, but we’re getting there. And a lot of that… We talk about AI a lot. And obligatory nod to AI for this podcast, but when we talk about AI, and I think I mentioned this on another podcast already, that when you look at machine translation, that was really like AI Alpha or AI Beta where it was already using an algorithm to start putting together translations for written text. So with AI in the mix now, we’re getting a lot more, I guess, interesting results, a lot more targeted results with machine translation. I still don’t think it’s a perfect solution, and we’ll certainly need some proofreading, but it’s come a long way. And I think that that trend is certainly not going to fall off the radar anytime soon. In fact, recently Sarah O’Keefe had a podcast with Sebastian Göttel about strategies for AI and technical documentation, and they actually recorded that podcast in German. And they used AI to translate and voice augment into English. So not only were things machine translated from German into English, but the German speaking was then synthetically reproduced in English, which just is really cool.
CC: Yeah, it’s super cool to listen to, and we’ll link both versions in the show notes, the German version and the English version. But yeah, you’re right, it was a super cool process. You had mentioned earlier there was a human piece that was still needed, because when it was originally recorded in German, we got the German transcript and translated that into English. At first it was Google Translate, just to get it all done, but then Sarah needed to go and check it because she speaks both English and German. We needed that human element to make sure the translation was correct. Because like you were saying, you can’t just put it into a machine and, cool, yay, it’s done. We need the human to make sure that it was actually translated properly and that things make sense. And we did notice, once Sebastian’s synthetic audio was created in English, that a lot of the prompts or questions were just different lengths. The English version of the exact same question was sometimes shorter and sometimes longer, because the languages are different. So it was a really cool experiment, and it does open up some interesting possibilities, wouldn’t you say, for localization? And we’ve never been able to have a German and English podcast before, so that’s kind of cool.
BS: Yeah, no, it was very cool. I sat in the back of the room just watching the entire process, but it was definitely something I was quite interested in seeing. Yeah, there was a lot of editing of the English translation because, again, it was pure machine translation and it needed some help. But once that was done, the synthetic audio really came right together, and I was impressed by how that happened.
CC: Yeah. And it’s so interesting because it’s definitely… It sounds like Sebastian, but then also it sounds not quite human, but it’s really close. It’s really interesting. But it did-
BS: Very uncanny valley.
CC: Yeah, it was, and I only speak English. I don’t speak German, so it made that podcast accessible to me. I was able to listen to it, and it does present some interesting opportunities, but as always with AI, the human element was definitely needed. It was very important to make sure that the humans at the other end of the screen could eventually consume it.
BS: Oh, yeah.
CC: Awesome. Well, Bill, thank you so much. We covered a lot of ground today, and we really appreciate it. This was really helpful, and yeah, thanks for being on the show.
BS: Yeah, thanks.
CC: And thank you for listening to the Content Strategy Experts podcast, brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
The post Accelerate global growth with a content localization strategy appeared first on Scriptorium.
In episode 169 of The Content Strategy Experts podcast, Sarah O’Keefe and special guest Sebastian Göttel of Quanos engage in a captivating conversation on generative AI and its impact on technical documentation. To bring these concepts to life, this English version of the podcast was created with the support of AI transcription and translation tools!
Sarah O’Keefe: So what does AI have to do with poems?
Sebastian Göttel: You often have the impression that AI creates knowledge; that is, creates information out of nothing. And the question is, is that really the case? I think it is quite normal for German scholars to not only look at the text at hand, but also to read between the lines and allow the cultural subtext to flow. From the perspective of scholars of German literature, generative AI actually only interprets or reconstructs information that already exists. Maybe it’s hidden, only implicitly hinted at. But this then becomes visible through the AI.
How this podcast was produced:
This podcast was originally recorded in German by Sarah and Sebastian, then Sarah edited the audio. Sebastian used Whisper, OpenAI’s speech-to-text tool, to transcribe the German recording, followed by necessary revisions. The revised German transcript was machine translated into English via Google Translate, and then we cleaned up the English transcription.
Sebastian used ElevenLabs to generate a synthetic audio track from the English transcript. Sarah re-recorded her responses in English and then we combined the two recordings to produce the composite English podcast.
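The production process described above can be sketched as a simple pipeline: transcription, human review, translation, another review, then synthesis. This is a hypothetical illustration only, not the actual scripts used for the episode; the function names and stand-in callables here are assumptions for demonstration, with the real tools (Whisper, Google Translate, ElevenLabs) noted in comments.

```python
# Hypothetical sketch of the localization pipeline described above:
# speech-to-text (e.g., Whisper), machine translation (e.g., Google
# Translate), and text-to-speech (e.g., ElevenLabs). Each stage is
# passed in as a callable so the pipeline stays tool-agnostic, and
# human review runs between stages, as it did for this episode.

from typing import Callable

def localize_podcast(
    audio_path: str,
    transcribe: Callable[[str], str],    # audio file -> source-language transcript
    review: Callable[[str], str],        # human cleanup of a transcript
    translate: Callable[[str], str],     # source language -> target language
    synthesize: Callable[[str], bytes],  # target-language text -> audio bytes
) -> bytes:
    """Run transcription, review, translation, review, and synthesis in order."""
    transcript = review(transcribe(audio_path))   # revise the German transcript
    translated = review(translate(transcript))    # clean up the English text
    return synthesize(translated)                 # generate synthetic audio

# Example with stand-in callables (real tool integrations would replace these):
audio = localize_podcast(
    "episode_169_de.mp3",
    transcribe=lambda path: "Hallo und willkommen.",
    review=lambda text: text.strip(),
    translate=lambda text: "Hello and welcome.",
    synthesize=lambda text: text.encode("utf-8"),
)
```

The point of the sketch is the ordering: machine output is never passed directly to the next machine stage without a review step in between, which matches the human-in-the-loop process the episode describes.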
Related links:
LinkedIn:
Transcript:
Sarah O’Keefe: Today’s episode is available in English and German. Since our guest works with AI in German-speaking countries, we had the idea to create this podcast in German. The English version was then put together with AI support, particularly synthetic audio. So welcome to the Content Strategy Experts Podcast, today offered for the first time in German and English. Our topic today is Information compression instead of knowledge creation: Strategies for AI in technical documentation. In the German version, we tried to put it all together in one nice long word, but it didn’t quite work. Welcome to the Content Strategy Experts Podcast, brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we talk about best practices for AI and tech comm with our guest Sebastian Göttel of Quanos. Hello everyone, my name is Sarah O’Keefe. I am the CEO here at Scriptorium. My guest is Sebastian Göttel. Sebastian Göttel has been working in the area of XML and editorial CCMS systems in technical documentation for over 25 years. He originally studied computer science with a focus on AI. Currently, he is Product Manager for Schema ST4 at Quanos, one of the most used editorial systems in machinery and industrial engineering in the German-speaking regions. He is also active in Tekom and, among other things, contributed to version 1 of the iiRDS standard. Sebastian lives with his wife and daughter, three cats, and two mice just outside Nuremberg. Sebastian, welcome. I look forward to our discussion. In English, we say create once, publish everywhere. This is about recording once and outputting multiple times. So, off we go. Sebastian, our topic today is, as I said, information consolidation instead of knowledge creation and how this strategy could be used for AI in technical documentation. So please, explain.
Sebastian Göttel: Yes, first of all thank you for inviting me to the podcast. It’s not that easy to impress a 14-year-old daughter. And I thought, with this podcast I have a chance. So I told her that I would be talking about AI on an American podcast soon. And the reaction was a little different than I expected. Youuuuu will you speak English? You can put quite a lot of meaning into a single “uuuu” like that. And that’s why I’m glad that I can speak German here. But, and this is now the transition to the topic, what will the AI make of the “You will speak English”? How does it want to pronounce that correctly in text-to-speech or translate it into another language? And that’s what I think our conversation will be about today. If we want to understand how AI understands us, but also how we can use it in technical documentation, then we have to talk about information compression, but also invisible information. “You will speak English?” Can the AI conceptualize that my daughter doesn’t trust me to do this or simply finds my German accent in English gross? Well, if the AI can understand that, then it is new information or actually information that was already there and that both father and daughter were actually aware of during the conversation. I find it quite exciting that German scholars have often dealt with this. Namely, what is in such a text, and what is meant in the text? What’s between the lines? And when you think back to your school days, these interpretations of poems immediately come to mind.
SO: So, poems. And what does generative AI have to do with poem interpretation?
SG: Yes, well, you often have the impression that AI creates knowledge; that is, creates information out of nothing. And the question is, is that really the case? I think it is quite normal for scholars of German literature to not only look at the text at hand, but also to read between the lines and let the cultural subtext flow in. And from that perspective, generative AI actually only interprets or reconstructs information that already exists. Maybe it’s hidden, only implicitly hinted at. But it then becomes visible through the AI. Wow, I never thought I would refer to German literature scholarship in a technical podcast.
SO: Yes, me neither. But the question remains: how does AI work, why does it work, and why do these problems still occur? What is our understanding of the situation today?
SG: Well, I think we’re still pretty impressed by generative AI, and we’re still trying to understand what we’re actually perceiving and what’s happening there. There are things that just make our jaws drop. And then there are those epic fails again, like the recent depiction of World War II German soldiers by Gemini, Google’s generative AI. The soldiers were rendered politically correct by today’s standards, and there were, among other things, Asian-looking women with steel helmets. I always like to compare this with the beginnings of navigation systems. There were always these anecdotes in the newspaper about someone driving into a river because their navigation system mistook a ferry line for a bridge. It was relatively easy to fix such an error in the navigation system; it was clear why the system made the mistake. Unfortunately, with generative AI it’s not that easy. We don’t know; actually, we haven’t even really understood how these seemingly intelligent feats come about. But the epic fails make us aware that it’s not an algorithm, but a phenomenon that seems to emerge when you pack many billions of text fragments into a matrix.
SO: And what do you mean here by “emerge”?
SG: That is a term from natural science. I once compared it to water molecules. A single water molecule isn’t particularly spectacular, but if, for example, you’re sailing in a storm on the Atlantic or hitting an iceberg, you get a different perspective. Because if you put many water molecules together, completely new behavior emerges. And it took physics and chemistry many centuries to partially unravel this. And I think we will, maybe not for quite as long, but we will have to do a lot more research into generative AI in order to understand a little more about what exactly is happening. And I think the epic fails should make us aware that we would currently do well not to blindly place our fate in the hands of a Large Language Model. I think the human-in-the-loop approach, where the AI makes a suggestion and then a human looks at it again, remains the best mode for the time being. The translation industry, which feels like it is a few years ahead of the world when it comes to generative AI or neural networks, has recognized this quite cleverly and implemented it profitably.
SO: And if translation is the model, what does this mean for generative AI and technical documentation?
SG: That’s a good question. Let’s take a step back. At the beginning of my working life, the revolution in technical documentation was structured documents: SGML and XML. This has been known for several decades now, and it is still not used in every editorial team. So we now have these structured documents on one side, and on the other the nasty unstructured documents. I always thought that was a bit of a misnomer, because unstructured documents are actually structured. Well, at least most of the time. There’s a macro level where I have a table of contents, a title page, and an index. There are chapters. Then there are paragraphs, lists, and tables, and that goes down to the sentence level, where I have enumerations, instructions, and so on. It’s not for nothing that some linguists call this text structure. And the beauty of XML is that it suddenly makes this implicit structure explicit. And the computer can then compute with our texts. Because if we’re being honest, in the end, XML is not for us, but for the machine.
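To make that idea concrete, here is a minimal sketch (my own illustration, not from the podcast; the `task` and `step` element names are invented for the example): numbered steps that are only implicit in plain text can be made explicit as XML, so a machine can compute with the content.

```python
import re
import xml.etree.ElementTree as ET

def steps_to_xml(text: str) -> str:
    """Make the implicit structure of a plain-text procedure explicit as XML."""
    task = ET.Element("task")
    steps = ET.SubElement(task, "steps")
    for line in text.strip().splitlines():
        match = re.match(r"\s*\d+[.)]\s+(.*)", line)
        if match:  # a numbered line is implicitly a step
            ET.SubElement(steps, "step").text = match.group(1).strip()
    return ET.tostring(task, encoding="unicode")

procedure = """
1. Switch off the machine.
2. Open the maintenance hatch.
3. Replace the filter cartridge.
"""
print(steps_to_xml(procedure))
```

A real conversion would of course handle far messier input; the point is only that the structure was already there before the markup was.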
SO: Is it possible then that AI can discover structures that, for us humans, have so far only been expressed through XML?
SG: Yes. Well, I recently looked into Invisible XML. There you can overlay patterns onto unstructured text, and they become visible as XML. Very clever. I think generative AI is a kind of Invisible XML on steroids. The rules aren’t as strict as in Invisible XML, but genAI also understands linguistic nuances. I found it very exciting: a customer of ours fed unstructured content from PDFs into ChatGPT in order to convert it to XML. The AI was surprisingly good at discovering the invisible structure that was hidden in the content and converted it to really good XML. So that was impressive. When AI now appears to create information out of nothing, I think it is more likely that it makes existing but hidden information visible.
SO: Yes, I think the problem is that this hidden structure is there in some documents, but in others there’s what we call “crap on a page” in English. There’s no structure. And from one document to the next there is no consistency; they are completely different. Writer 1 and Writer 2 write and never talk to each other. So if the AI now creates an entire chapter and an outline from a few keywords, how does that work? How does that fit together?
SG: Yes, you’re right. So far we’ve been talking about taking a PDF and adding XML to it. But if I’m put on the spot, I’ll throw in a few keywords and ChatGPT suddenly writes something. I think the same idea applies here: this is actually hidden information. It might sound a bit daring at first, but nothing new, nothing completely surprising, is created. If I just ask, let’s say, ChatGPT to give me an outline for documentation for a piece of machinery, then something comes out, and I think most of our listeners would write it down much the same way. This is nothing new. This is hidden information contained in the training data, which is simply made visible through the query. Because ultimately, generative AI creates this information from my query and this huge amount of training data. And the answer is chosen so that it fits my query and the training data well. It lays a synthetic layer over the top. And in the end, the result is not net new information, but hopefully the necessary information delivered in a form that’s easier to process further. Either, like the example with the PDF, enriched with XML, or now I have an outline. And I imagine it’s a bit like a juicer. The juicer doesn’t invent juice, it just extracts it from the oranges.
SO: Making information easier to process sounds almost like a job description for technical writers. And what about other methods? So if we now have metadata or knowledge graphs, what does that look like?
SG: That’s right, in addition to XML, these are also really important: metadata and knowledge graphs. Metadata condenses information into a few data points, and knowledge graphs then capture the relationships among these data points. And this is precisely why knowledge graphs, but also metadata, make invisible information visible: the connections that were previously implicit can now be traced through the knowledge graph. And that can be easily combined with generative AI. At the beginning, the knowledge graph experts were a bit nervous, as you could tell at conferences, but now they’re actually pretty happy, because they’ve discovered that generative AI plus knowledge graphs is much better than generative AI without knowledge graphs. And of course, that’s great. By the way, this isn’t the only trick where we have something in technical documentation that helps generative AI get going. If you want to make large knowledge bases searchable with Large Language Models, you can do that today with RAG, or Retrieval Augmented Generation. It lets you combine your own documents with a pre-trained model like ChatGPT very cost-effectively. If you now combine RAG with a faceted search, as we usually have in the content delivery portals in technical documentation, then the results are much better than with the usual vector search, because vector search is, in the end, just a better full-text search. That’s another place where structured information that we already have can jump-start AI.
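As a rough sketch of that last point (my own toy code, not any Quanos product; the metadata fields and texts are invented): a retrieval step can filter candidate chunks by facet metadata first and only then rank them. Here a naive term-overlap score stands in for the embedding similarity a real RAG system would use, and the top chunks would then go into the LLM prompt.

```python
def retrieve(chunks, query_terms, facets, top_k=2):
    """Facet-filter first, then rank by naive term overlap
    (a stand-in for real embedding similarity)."""
    # Faceted search: keep only chunks whose metadata matches every facet.
    candidates = [
        c for c in chunks
        if all(c["meta"].get(k) == v for k, v in facets.items())
    ]
    # Rank the survivors by how many query terms appear in the text.
    scored = sorted(
        candidates,
        key=lambda c: sum(t in c["text"].lower() for t in query_terms),
        reverse=True,
    )
    return scored[:top_k]

chunks = [
    {"text": "Replace the filter cartridge monthly.",
     "meta": {"doctype": "maintenance", "product": "pump-a"}},
    {"text": "The pump housing is made of stainless steel.",
     "meta": {"doctype": "datasheet", "product": "pump-a"}},
    {"text": "Lubricate the bearing every 500 hours.",
     "meta": {"doctype": "maintenance", "product": "pump-a"}},
]

hits = retrieve(chunks, ["filter", "replace"], {"doctype": "maintenance"})
print([h["text"] for h in hits])
```

The facet filter is what the structured metadata buys you: the datasheet chunk never competes with the maintenance chunks, no matter how similar its wording is.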
SO: Is it your opinion that structured information will not become obsolete through AI, but will actually become more important?
SG: My impression is that the belief has taken hold that structured information is better for AI. I think we’re all a bit biased, naturally. We have to believe that; these are the fruits of our labor. It’s a bit like apples. The apple from an organic farmer is obviously healthier than the conventional apple from the supermarket. I think this is scientific fact. But in the end, any apple is better than a pack of gummy bears. And that’s what can be so disruptive about AI for us. Because at the end of the day, we are providing information. And if users get information that is sufficient, that is good enough, why should they go the extra mile to get even better information? I don’t know.
SO: Okay, so I’m really interested in this gummy bear story and I want to hear a little bit more about that. But why is your view of the tech comm team’s role so, let’s say, pessimistic?
SG: I think my focus has gotten a little wider recently, so I’m not really looking just at technical documentation. When it comes to technical documentation, we are lost without structured data. It will not work. But if we look at the bigger picture: at Quanos we not only have a CCMS, we also create a digital twin for information. I sit in all these working groups as the guy from the tech doc area. And I always have to accept that our particularly well-structured information from tech doc, the one with the extra vitamins and secondary nutrients, is actually the exception out there when we look at the data silos that we want to combine in the info twin. When I was young, I believed that we had to convince others to work the way we do in tech docs. That would have been really fantastic. But if we’re honest with ourselves, it just doesn’t work. The advantages that XML provides in technical documentation are too small in other areas for individual colleagues to justify a switch. The exceptions prove the rule. As a result, tons of information is out there, locked up in unstructured formats. And it can only be made accessible with AI. That will be the key.
SO: And how do we do that? If XML isn’t the right strategy, what does that look like?
SG: Well, so let’s take an example. So many of our customers build machinery and let’s take a look at the documentation that they supply. There are several dozen PDFs for each order. And of course the editor has a checklist and knows what to look for in this pile of PDFs. The test certificate, the maintenance table, parts lists, and so on. And even though the PDFs are completely “unstructured” as compared to XML files, we humans are able to extract the necessary information. And the exciting thing about it is that anyone can actually do it. So you don’t have to be a specialist in bottling systems or industrial pumps or sorting machines. If you have an idea of what a test certificate, a maintenance table, a parts list is, then you can find it. And here’s the kicker: the AI can do that too.
SO: Ahh. And so in this case are you more concerned with metadata…or something else?
SG: No, you’re right. This is in fact about metadata and links. I find it fascinating what this does to our language usage, because we have gotten used to saying that we enrich content with metadata. But in many cases we have simply made the invisible structure explicit. No information was added. Nothing has become richer, just clearer. But now imagine that your supplier didn’t provide a maintenance table. Then you need to start reading, understand the maintenance instructions, and extract the necessary information. And that’s tedious. Even here, AI can still provide support. But how well that works depends on how clearly the maintenance instructions are written. The more specific background knowledge is necessary, the more difficult it becomes for the AI to provide assistance.
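A hypothetical sketch of that checklist idea (the cue phrases and document types here are invented for illustration): classifying each supplier document by the phrases a human reviewer would scan for is exactly the step that turns invisible structure into explicit metadata.

```python
# Cue phrases a reviewer (or an AI) might scan for in a pile of supplier PDFs.
CUES = {
    "test_certificate": ["test certificate", "certificate of conformity"],
    "maintenance_table": ["maintenance interval", "maintenance table"],
    "parts_list": ["parts list", "spare parts"],
}

def classify(text: str) -> str:
    """Attach an explicit doctype label based on cue phrases in the text."""
    lowered = text.lower()
    for doctype, phrases in CUES.items():
        if any(p in lowered for p in phrases):
            return doctype
    return "unknown"

print(classify("Spare parts for sorting machine, item numbers below ..."))
```

In practice an LLM replaces the brittle phrase list, because it recognizes a maintenance table even when none of the expected words appear; but the output is the same kind of metadata label.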
SO: What does that look like? Do you have an example or use case where AI doesn’t help at all?
SG: It depends on contextual knowledge. I once received parts of a risk analysis from a customer, and her question was, “Can you use AI to create safety messages?” And I said, “Sure, let’s look at the risk analysis and then at what the technical writers made of it.” And they were exemplary safety messages. But there was so little content in the risk analysis that with the best will in the world you couldn’t do anything with artificial intelligence; that end result was only possible because the technical writers had an incredibly good understanding of the product and also knew the industry standards. The information was hidden not in the input, but in the contextual knowledge. And that’s so specialized that it’s of course not available in the Large Language Model.
SO: In this use case, you don’t see any possibility for AI at all?
SG: Well, at least not for a generic Large Language Model. Something like ChatGPT or Claude has no chance here. There is an opportunity to specialize these models: you can fine-tune them with context-specific content. But at the moment we don’t know whether we typically have enough content. There are some initial experiments. But let’s think back to the water molecules: we need quite a few of them to make an iceberg or even a snowman. Ultimately, you have to ask which supporting materials are needed from which point of view, and fine-tuning is really expensive. There are costs, it takes a long time, and performance is also an issue. How practical is this approach? Do we have training data? Given all these aspects, it is still unclear what the gold standard is for making a generic Large Language Model usable for content work in very specific contexts. We just don’t know today.
SO: Can you already see or predict how generative AI will change or must change technical documentation?
SG: I really think it’s more like looking into my crystal ball, because it’s not that easy to estimate which use cases are promising for AI in technical documentation. As a rule, you have a task where a textual input needs to be transformed into a textual output according to a certain standard. And it used to be garbage in, garbage out. In my opinion, Large Language Models change this equation permanently. Input that we were previously unable to process automatically due to a lack of information density can now be enriched with general contextual knowledge in such a way that it becomes processable. Missing information cannot be added; we’ve discussed that. But those unspoken assumptions, we can in fact pack them in. And that helps us in many places in technical documentation, because one of the ways good technical documentation differs from bad documentation is that fewer assumptions are necessary to understand the text or to process it automatically. And that’s why I find information compression instead of knowledge creation to be a kind of Occam’s Razor. I look at the assignment. If it’s simply a matter of making hidden information visible or putting it into a different form, then it’s a good candidate for generative AI. What if it’s more about refining the information using other sources? Then it becomes more difficult. If that other information is in a knowledge graph, already broken down there, then I can explicitly enrich the input before handing it over to the Large Language Model, and then it works again. But if the information, for example the inherent product knowledge, is in the editor’s head, as was the case with my client’s risk analysis, then the Large Language Model simply has no chance. It won’t generate any added value. Then you may have to rethink your approach. Can you divide the task somehow?
Maybe there is a part where this knowledge is not necessary, and I have an upstream or downstream process where I can optimize something with AI. And I think that’s where the mother lode of opportunities lies. This art of distinguishing what is possible from what is impossible, and it will be more of a kind of engineering art, will be the factor in the coming years that decides whether generative AI is of use to me or not.
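That decision procedure could be caricatured in a few lines (my own framing of the Occam's Razor test described above, not code from the episode):

```python
def genai_fit(info_in_input: bool, info_in_knowledge_graph: bool) -> str:
    """Toy triage for whether a tech-doc task suits generative AI."""
    if info_in_input:
        # The information is hidden in the text itself: just make it visible.
        return "good candidate"
    if info_in_knowledge_graph:
        # Enrich the input from the graph first, then hand it to the LLM.
        return "workable with enrichment"
    # The knowledge lives only in an expert's head, as with the risk analysis.
    return "poor fit: split the task or keep a human in the loop"

print(genai_fit(info_in_input=False, info_in_knowledge_graph=True))
```

The real skill, as the episode argues, is making that first judgment honestly for each task rather than assuming the model can conjure what the input lacks.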
SO: And what do you think? Of use, or not of use?
SG: I think we’ll figure it out. But it will take much longer than we think.
SO: Yes, I think that’s true. So thank you very much, Sebastian. These are really very interesting perspectives, and I’m looking forward to our next discussion when, in two weeks or three months, there is something completely new in AI and we have to talk again about what we can do and what new things are available. So thank you very much and see you soon!
SG: … soon somewhere on this planet.
SO: Somewhere.
SG: Thank you for the invitation. Take care, Sarah.
SO: Yes, thank you, and many thanks to those listening, especially for the first time in the German-speaking areas. Further information about how we produced this podcast is available at scriptorium.com. Thank you for listening to the Content Strategy Experts Podcast, brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
The post Strategies for AI in technical documentation (podcast, English version) appeared first on Scriptorium.
Episode 169 is available in English and German. Since our guest Sebastian Göttel works with AI in the German-speaking regions, we had the idea to create this podcast in German. The English version was then put together with AI support.
Sarah O’Keefe: Was hat die generative KI mit Gedichtinterpretationen zu tun?
Sebastian Göttel: Ja, nun, also oft hat man da ja den Eindruck, dass KI das Wissen schöpft, also Informationen aus dem Nichts erschafft. Und da ist die Frage, ist das denn wirklich so? Denn für die Germanisten ist es, glaube ich, schon eher normal, nicht nur den vorliegenden Text anzuschauen, sondern auch zwischen den Zeilen zu lesen, den kulturellen Subtext einfließen zu lassen. Und aus dem Blickwinkel der Germanisten, interpretiert oder rekonstruiert generative KI eigentlich nur Informationen, die schon vorhanden ist. Möglicherweise ist die verborgen, nur implizit angedeutet. Aber die wird durch die KI dann sichtbar.
Related links:
LinkedIn:
Transcript:
Sarah O’Keefe: Die heutige Episode ist auf Englisch und Deutsch verfügbar. Da unser Gast sich im deutschsprachigen Raum mit KI beschäftigt, kam die Idee, diesen Podcast auf Deutsch zu erstellen. Die englische Version wurde dann mit KI-Unterstützung zusammengebastelt. Also herzlich willkommen zum Content Strategy Experts Podcast, heute zum ersten Mal auf Deutsch. Unser Thema ist heute Informationsverdichtung statt Wissensschöpfung. Strategien für KI in der technischen Dokumentation. Wir haben versucht, das alles in ein Wort zusammenzubringen, das hat aber nicht ganz geklappt. Welcome to the Content Strategy Experts Podcast, brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize and distribute content in an efficient way. In this episode, we talk about best practices for AI and TechCom with our guest Sebastian Göttel of Quanos. Hallo, ich heiße Sarah O’Keefe. Ich bin hier bei Scriptorium die Geschäftsführerin. Mein Gast ist Sebastian Göttel. Sebastian Göttel arbeitet seit über 25 Jahren im Bereich XML und Redaktionssysteme in der technischen Dokumentation. Ursprünglich hat er mal Informatik mit Schwerpunkt KI studiert. Aktuell ist er bei Quanos Product Manager für Schema ST4, einem der meistgenutzten Redaktionssysteme im Maschinen- und Anlagenbau in DACH. Er ist auch in der Tekom aktiv und hat unter anderem an der Version 1 des iiRDS-Standards mitgewirkt. Sebastian lebt mit Frau und Tochter, drei Katzen und zwei Mäusen vor den Toren von Nürnberg. Sebastian, herzlich willkommen. Ich freue mich auf diesen Austausch. Auf Englisch sagen wir ja create once, publish everywhere. Hier geht es um einmal aufnehmen und mehrfach ausgeben. Also, los geht’s. Sebastian, unser Thema ist heute, wie gesagt, die Informationsverdichtung anstatt von Wissensschöpfung. Und wie diese Strategie für KI in der technischen Dokumentation eingesetzt werden könnte. Also, bitte, erklär doch mal.
Sebastian Göttel: Ja, erstmal vielen Dank für die Einladung in den Podcast. Es ist ja gar nicht so einfach, eine 14-jährige Tochter zu beeindrucken. Und ich dachte mir, mit diesem Podcast habe ich eine Chance. Also habe ich ihr erzählt, dass ich demnächst in einem amerikanischen Podcast über KI sprechen werde. Und die Reaktion war ein bisschen anders, als ich mir das erwartet habe. Duuu wirst da Englisch sprechen? Man kann schon ziemlich viel Bedeutung in so ein einzelnes Duuu legen. Und von daher bin ich zum einen froh, dass ich hier Deutsch sprechen darf. Aber, und das ist jetzt die Überleitung zum Thema, was wird die KI aus dem “Duuu wirst da Englisch sprechen” machen? Wie will sie das beim Text-to-Speech korrekt aussprechen oder in eine andere Sprache übertragen? Und darum, glaube ich, wird es in unserem Gespräch heute gehen. Wenn wir verstehen wollen, wie KI uns versteht, aber auch wie wir sie in der technischen Dokumentation einsetzen können, dann müssen wir über Informationsverdichtung, aber auch unsichtbare Informationen sprechen. Duuu wirst da Englisch sprechen. Kann die KI rekonstruieren, dass meine Tochter mir das nicht zutraut beziehungsweise meinen deutschen Akzent im Englischen einfach grottig findet? Naja, also wenn die KI das rekonstruieren kann, ist es dann neue Information oder eigentlich eher Information, die schon da, war und die eigentlich im Gespräch sowohl Vater als auch Tochter bewusst war. Ich finde das ziemlich spannend, dass die Germanisten sich damit schon ganz häufig beschäftigt haben. Nämlich, was steht in so einem Text drin und was ist in dem Text gemeint? Was steht zwischen den Zeilen? Und wenn man so an seine Schulzeit zurückdenkt, dann fallen einem ja sofort diese Gedichtinterpretationen ein.
SO: Also Gedichte und was hat die generative KI mit Gedichtinterpretationen zu tun?
SG: Ja, nun, also oft hat man da ja den Eindruck, dass KI das Wissen schöpft, also Informationen aus dem Nichts erschafft. Und da ist die Frage, ist das denn wirklich so?
Denn für die Germanisten ist es, glaube ich, schon eher normal, nicht nur den vorliegenden Text anzuschauen, sondern auch zwischen den Zeilen zu lesen, den kulturellen Subtext einfließen zu lassen. Und aus dem Blickwinkel der Germanisten, interpretiert oder rekonstruiert generative KI eigentlich nur Informationen, die schon vorhanden ist. Möglicherweise ist die verborgen, nur implizit angedeutet. Aber die wird durch die KI dann sichtbar. Ui, hätte nie gedacht, dass ich mal in einem technischen Podcast mich auf Germanisten berufe.
SO: Ja, und ich auch nicht. Da bleibt aber doch die Frage, wie funktioniert das? Also wie funktioniert die KI und warum funktioniert das? Und wieso gibt es dann diese Probleme? Was ist denn heute unser Verständnis von der Lage?
SG: Also ich glaube, wir sind immer noch ziemlich beeindruckt von der generativen KI und wir versuchen noch zu begreifen, was wir da überhaupt wahrnehmen, was da passiert. Da gibt es Dinge, die lassen uns einfach den Kiefer runterklappen. Und dann gibt es wieder diese Epic Fails, wie vor kurzem diese Darstellung von Wehrmachtsoldaten von Gemini, der generativen KI von Google. Die Soldaten waren nämlich nach unserer heutigen Vorstellung politisch korrekt. Und da gab es dann unter anderem asiatisch aussehende Frauen mit Stahlhelm. Ich vergleiche das immer so ganz gern mit den Anfängen der Navigationssysteme. Da gab es ja auch immer diese Anekdoten in der Zeitung, dass wieder jemand in den Fluss gefahren ist, weil sein Navi die Fährlinie für eine Brücke gehalten hat. So einen Fehler konnte man im Navigationssystem relativ einfach fixen. Da war klar, warum das Navi den Fehler gemacht hat. Bei der generativen KI ist das leider nicht ganz so einfach. Wir wissen nicht, eigentlich, wir haben es noch nicht mal wirklich verstanden, wie diese teilweise intelligenten Leistungen zustande kommen.Die Epic Fails, die machen uns aber bewusst, dass es sich nicht um einen Algorithmus handelt, sondern um ein Phänomen, das scheinbar emergiert, wenn man viele Milliarden Texte in eine Matrix packt.
SO: Und was meinst du da mit emergiert? Was ist das denn?
SG: Das ist ein Begriff aus der Naturwissenschaft. Ich habe das mal mit Wassermolekülen verglichen. Ein einzelnes Wassermolekül ist nicht sonderlich spektakulär, aber wenn du zum Beispiel im Segelboot in einem Sturm auf dem Atlantik unterwegs bist oder auf einen Eisberg aufläufst, dann kriegst du eine andere Perspektive. Denn viele Wassermoleküle zusammengenommen zeigen ganz neues Verhalten. Und das nennt man Emersion. Und Physik und Chemie haben viele Jahrhunderte gebraucht, um das halbwegs zu enträtseln. Und ich denke, wir werden, vielleicht nicht ganz so lange, aber wir werden noch ein gutes Stück weiterforschen müssen bei der generativen KI, um auch da ein bisschen mehr zu verstehen, was da jetzt genau passiert. Und ich finde, die Epic Fails, die sollten uns bewusst machen, dass wir aktuell gut daran tun, unser Schicksal nicht blind in die Hände eines Large Language Models zu legen. Ich finde, der Ansatz Human in the Loop, wo die KI einen Vorschlag macht und dann ein Mensch nochmal drüber schaut, das bleibt bis auf weiteres der beste Modus. Und die Übersetzerbranche, die gefühlt der ganzen Welt ein paar Jahre voraus ist, wenn es um generative KI geht oder um neuronale Netze, die hat das ziemlich klug erkannt und gewinnbringend umgesetzt.
SO: Und wenn also jetzt die Übersetzung das Muster ist, was heißt das dann für generative KI und die technische Doku?
SG: Das ist eine gute Frage. Lass uns mal einen Schritt zurück machen. Also am Anfang meines Arbeitslebens, da war die Revolution in der technischen Dokumentation, das waren diese strukturierten Dokumente. SGML und XML. Und das kennt man jetzt also mittlerweile schon seit mehreren Jahrzehnten und es ist ja immer noch nicht in jeder Redaktion gebräuchlich. Und das heißt, wir haben jetzt diese strukturierten Dokumente und das andere, das sind die bösen unstrukturierten Dokumente. Und ich fand das schon immer so ein kleines bisschen einen Etikettenschwindel, denn unstrukturierte Dokumente sind ja in Wirklichkeit auch strukturiert. Also meistens zumindest. Da gibt es so eine Makro-Ebene, da habe ich ein Inhaltsverzeichnis, ein Titelblatt, ein Stichwortverzeichnis. Es gibt Kapitel. Dann gibt es Absätze, Listen und Tabellen und das geht dann runter bis auf die Satzebene. Da habe ich Aufzählungen, Aufforderungen und so weiter. Und nicht umsonst nennen das manche Linguisten ja Textstruktur. Und wenn ich jetzt mit XML rangehe, das Schöne daran an XML ist, dass ich diese implizite Struktur nun plötzlich explizit mache. Und damit kann dann der Computer mit unseren Texten rechnen. Denn wenn man ehrlich ist, am Ende ist XML nicht für uns, sondern für die Maschine.
SO: Kann es dann sein, dass die KI Strukturen entdecken kann, die für uns Menschen bis jetzt zwangsweise nur durch XML ausgedrückt wurden?
SG: Ja. Also ich habe mich da mal vor kurzem mit Invisible XML beschäftigt und da kann man über unstrukturierten Text Muster legen und die werden dann als XML sichtbar gemacht. Ganz clever. Und ich finde generative KI ist so eine Art Hochleistungs-Invisible XML. Also weil es zwar nicht so ganz strikt wie Invisible XML Regeln enthält, aber dafür auch sprachliche Nuancen versteht. Und ich fand es ganz spannend, ein Kunde von uns, der hat unstrukturierte PDF-Inhalte in Chat-GPT gefüttert, also unstrukturierte Inhalte aus dem PDF, um sie nach XML dann zu konvertieren. Und die KI hat erstaunlich gut die unsichtbare Struktur entdeckt, die in den Texten verborgen war und echt prima XML konvertiert. Also das war beeindruckend. Also wenn KI jetzt scheinbar Informationen aus dem Nichts schafft, dann ist es eben eher so, dass es existierende, aber verborgene Informationen sichtbar macht.
SO: Ja, ich glaube das Problem ist ja so, dass diese verborgene Struktur, also in manchen Dokumenten ist das da, aber in anderen da ist das, was wir auf Englisch, bei uns heißt das Crap on a Page. Das ist also, da gibt es keine Struktur. Und von einem Dokument zum anderen, da gibt es keine, also keine, die sind ganz anders. Also Redakteur 1 und Redakteur 2, die schreiben und die unterhalten sich niemals. Und also wenn die KI jetzt aus ein paar Stichworten ein ganzes Kapitel und eine Gliederung erstellt, wie geht das? Wie passt das zusammen?
SG: Ja, du hast recht. Jetzt haben wir die ganze Zeit drüber geredet: Wir nehmen PDF und dann wird da XML noch dazu gepackt. Aber wenn ich jetzt hier an der Stelle bin und sage, ich haue mal ein paar Stichworte rein und ChatGPT schreibt dann plötzlich etwas. Aber auch, ich finde auch da gilt dieser Gedanke, dass das eigentlich verborgene Information ist. Klingt vielleicht zuerst mal ein bisschen gewagt, aber da entsteht nichts Neues, nichts völlig Überraschendes. Wenn ich jetzt, sagen wir mal ChatGPT, einfach frage, gib mir mal eine Gliederung für eine Maschinen-Dokumentation. Und dann kommt da was raus. Ich denke, das würden die meisten von unseren Zuhörern genauso hinschreiben. Das ist nichts Neues. Das ist versteckte Information, die in den Trainingsdaten steckt, die durch die Anfrage einfach sichtbar gemacht werden. Denn letztendlich erstellt die generative KI diese Information aus meiner Anfrage und dieser riesigen Menge an Trainingsdaten. Und die Antwort, die ist so gewählt, dass sie gut zu meiner Anfrage und den Trainingsdaten passt. Die, ja, legt sich so ein bisschen wie so ein Layer da drüber, sodass das einfach gut das simuliert. Und am Ende habe ich damit dann keine neue Information, sondern hoffentlich die benötigte Information in einer besser verarbeitbaren Form. Entweder wie vorhin beim Beispiel mit dem PDF, mit XML angereichert oder ich habe jetzt eine Gliederung. Und ein bisschen stelle ich mir das vor wie so bei einer Saftpresse. Ja, die erfindet den Saft ja auch nicht, sondern die holt das aus den Orangen einfach raus.
SO: Informationen besser verarbeitbar zu machen, also das klingt doch fast schon wie eine Tätigkeitsbeschreibung für technische Redakteure. Und was ist mit anderen Methoden? Also wenn wir jetzt Metadaten oder Knowledge Graphs haben, wie sieht das denn da aus?
SG: Stimmt, das ist neben XML natürlich auch total wichtig. Also Metadaten, Knowledge Graphen. Ich finde Metadaten, die verdichten Informationen auf wenige Datenpunkte und die Knowledge Graphen, die machen dann die Beziehungen zwischen diesen Datenpunkten. Und gerade dadurch machen Knowledge Graphen, aber auch Metadaten ja unsichtbare Informationen sichtbar. Denn die Zusammenhänge, die vorher implizit wahr waren, die können jetzt durch die Knowledge Graphen nachvollzogen werden. Und das lässt sich prima mit generativer KI kombinieren. Am Anfang waren die Knowledge Graph Experten ein bisschen nervös, das konnte man merken auf Konferenzen, aber jetzt sind sie eigentlich ziemlich froh, dass sie festgestellt haben, generative KI plus Knowledge Graphen, das ist viel besser als generative KI ohne Knowledge Graphen. Und das ist natürlich prima. Das ist übrigens nicht der einzige Trick, wo wir in der technischen Dokumentation etwas haben, was der generativen KI auf die Sprünge hilft. Wenn man mit Large Language Models große Wissensbasen durchsuchbar machen will, dann macht man das ja heutzutage mit RAG, also Retrieval Augmented Generation. Und damit kann man sehr kostengünstig eigene Dokumente mit einem vortrainierten Modell wie ChatGPT kombinieren. Und kombiniert man jetzt RAG mit einer Facettensuche, so wie wir das in den Content Delivery Portalen in der TechDoc normalerweise haben, dann sind die Ergebnisse viel besser als mit der üblichen Vektorsuche, denn die ist am Ende ja nur eine bessere Volltextsuche. Und das ist dann auch wieder eine Möglichkeit, wo strukturierte Informationen, die wir eben haben, der KI auf die Sprünge hilft.
SO: So do you also think that structured information won't be made obsolete by AI, but will actually become even more important?
SG: I do have the impression that a belief has taken hold that structured information is better for AI. Of course, I suppose we're all a bit biased. We have to believe it; it's the fruit of our labor. And it's a bit like how the organic farmer's apple is of course healthier than the conventional apple from the supermarket. I think that's scientifically well established. But in the end, an apple is always better than a bag of gummy bears. And that's what can be so disruptive about AI for us. Because in the end, our job is conveying information. And if users get information that is sufficient, that is good enough, why would they go the extra mile to get even better information? I don't know.
SO: Well, I'm really intrigued by this gummy bear comparison. I want to hear a bit more about that. But why does that come across as so pessimistic for documentation teams, in your view?
SG: I think my picture has gotten a bit bigger recently. For me it's not really about technical documentation. In technical documentation we're lost without structured data. It simply won't work.
But if we look at the bigger picture: at Quanos we don't just have a component content management system, we also build a digital information twin. And I always sit in those working groups as the guy from the tech doc world. There I have to accept that our especially well-structured information from tech doc, the kind with all the vitamins and phytonutrients, is really the exception out there when we look at the data silos we want to bring together in the info twin. When I was young, I still believed we had to convince everyone else to work the way we do in technical documentation. That would have been great. But if we're honest, it just doesn't happen. The benefits we get from XML in technical documentation are too small for the individual colleagues in other departments to make the switch. Exceptions prove the rule. Which means there are tons of information out there locked up in these unstructured formats, and they can only be made accessible with AI. That will be the key.
SO: And how do we do that? If XML isn't the right path there, what does it look like?
SG: Well, let's take an example. Many of our customers are machinery and plant manufacturers, so look at supplier documentation. For a single order, several dozen PDFs come in. Of course, the technical writer has a checklist and knows what to look for in that pile of PDFs: the test certificate, the maintenance table, spare parts lists, and so on. And although the PDFs are completely unstructured, unstructured in the sense we XML people use the word, we humans are able to extract that information. And the exciting part: basically anyone can do it. You don't have to be a specialist in filling lines or industrial pumps or sorting machines. If you have an idea of what a test certificate, a maintenance table, or a spare parts list is, you'll find them. And here's the thing: then the AI can do it too.
SO: Aha. So in this case, are you talking about metadata, or about something else?
SG: No, you're right. This really is about metadata and links.
I find it fascinating what this does to the way we talk. We've gotten used to saying that we enrich content with metadata. But in many cases we've simply made the invisible structure explicit. No information was added; nothing got richer, it just got clearer. But now imagine your supplier didn't deliver a maintenance table. Then you have to start reading the maintenance procedures, understanding them, and extracting the necessary information, and that's quite tedious. Even here, AI can still help. But how well depends on how clearly the maintenance tasks are described. And the more specific background knowledge is required, the harder it becomes for the AI to assist usefully.
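The kind of extraction Sebastian describes, recognizing a maintenance table or a spare parts list in a pile of supplier PDFs, amounts to making implicit structure explicit as metadata. A deliberately naive sketch, with keyword rules standing in for the LLM call a real pipeline would use, and all document types and keywords invented:

```python
# Hypothetical sketch: tag unstructured supplier documents with a doc_type,
# making the implicit structure explicit as metadata. A production pipeline
# would replace these keyword rules with an LLM classification call.

KEYWORDS = {
    "test_certificate": ["certificate", "tested", "conforms"],
    "maintenance_table": ["maintenance", "interval", "lubricate"],
    "spare_parts_list": ["spare part", "part number", "qty"],
}

def classify(text):
    """Return the doc_type whose keywords occur most often in the text."""
    text = text.lower()
    scores = {t: sum(text.count(k) for k in kws) for t, kws in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

print(classify("Lubricate the bearing at each maintenance interval."))
print(classify("Spare part no. 4711, qty 2, part number on housing."))
```

As Sebastian notes, no domain expertise about pumps or filling lines is encoded here: the classifier only needs a general idea of what each document type looks like.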
SO: And what does that look like? Do you have an example or a use case where AI doesn't help at all?
SG: As I said, it depends on the context knowledge. A customer once sent me parts of a risk analysis, and the question was whether you could use AI to generate safety warnings from it. I said, sure, let's look at the risk analysis and then at what the writers had made of it. And they were textbook safety warnings. But there was so little in the risk analysis that, with the best will in the world, you couldn't do anything there with artificial intelligence. It only worked because the writers had an incredibly good understanding of the product and also had the relevant standards in the back of their minds. The information wasn't hidden in the input; it was in the context knowledge. And that's so specialized that it isn't in a large language model either.
SO: In a use case like that, do you see no opportunity for AI at all?
SG: At least not for a generic large language model. Something like ChatGPT or Claude doesn't stand a chance there. There are ways in AI to specialize these models further; you can fine-tune them with context-specific texts. But whether we normally have enough texts for that, nobody really knows yet. The first experiments are underway. But think back to the water molecules: for an iceberg, or even just a snowman, you need an awful lot of them. So today it comes down to which tools make sense under which constraints. Fine-tuning is really expensive, so cost is a factor. It takes a long time, so performance is an issue too. And how practical is it? Do we even have training data? Under all these aspects, what the golden path is for making a generic large language model usable for text work in very specific contexts is simply still unclear. We just don't know today.
SO: Can you already see, or foresee, how generative AI will change technical documentation, or how it must change it?
SG: To me that's still very much a look into the crystal ball. It's not at all easy yet to judge which use cases for AI in technical documentation are promising. As a rule, you have a task where textual input is to be transformed into textual output according to some specification. It used to be garbage in, garbage out. In my view, large language models change that equation for good. Input that we couldn't process automatically before, for lack of information density, we can now enrich with universal context knowledge, enrich it so that it becomes processable. Missing information cannot be filled in; we've just discussed that. But those unspoken assumptions, those we can indeed pack in. And that helps us in technical documentation in many places, because one of the things that distinguishes good technical documentation from bad is that fewer assumptions are needed to understand the text, or to process it by machine. That's why, for me, information condensation instead of knowledge creation works as a kind of Occam's razor. I look at the task at hand. If it's simply about making hidden information visible or putting it into another form, then it's a good candidate for generative AI. Or is it more about refining the information by drawing on other information sources? Then it gets harder. If I have that other information in a knowledge graph, already broken down there, then I can explicitly enrich the information before handing it to the large language model. And then it works again.
But if the information, say the inherent product knowledge, is in the writer's head, as with my customer's risk analysis, then the large language model simply has no chance. It won't generate any added value there. Then you may need to consider whether the task can be split up somehow. Maybe there's a part where that knowledge isn't needed, and an upstream or downstream process step where AI can optimize something. And I think that's where the action will be in the future. This art of distinguishing the feasible from the unfeasible, and it will be more of an engineering art, will be the factor in the coming years that decides whether generative AI delivers value for me or not.
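The enrichment step Sebastian suggests, resolving knowledge graph facts explicitly before handing the text to the model, might look like this in outline. The triples, entity names, and prompt format are all hypothetical:

```python
# Hypothetical sketch: enrich a request with knowledge-graph facts before
# building the LLM prompt, so the model needs fewer unspoken assumptions.

# Tiny knowledge graph as (subject, predicate, object) triples.
TRIPLES = [
    ("pump-a", "has_part", "filter-f2"),
    ("filter-f2", "service_interval", "500 h"),
    ("pump-a", "safety_standard", "ISO 13849"),
]

def facts_about(entity):
    """Collect triples mentioning the entity, following one hop of links."""
    direct = [t for t in TRIPLES if entity in (t[0], t[2])]
    neighbors = {t[2] for t in direct if t[0] == entity}
    hops = [t for t in TRIPLES if t[0] in neighbors]
    return direct + [t for t in hops if t not in direct]

def build_prompt(question, entity):
    """Prepend the resolved facts so the model works from explicit context."""
    facts = "\n".join(f"{s} {p} {o}" for s, p, o in facts_about(entity))
    return f"Known facts:\n{facts}\n\nQuestion: {question}"

prompt = build_prompt("How often must the filter be serviced?", "pump-a")
print(prompt)
```

The prompt now carries the service interval and the applicable standard explicitly, which is exactly the information that would otherwise live only in the writer's head or in a silo the model never sees.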
SO: And what do you think, will it deliver that value or not?
SG: I think we'll puzzle it out. But it will take much longer than we believe.
SO: Yes, I think that's true.
And thank you so much, Sebastian. These are really interesting perspectives, and I look forward to our next conversation when, in two weeks or three months, something brand new appears in AI and we have to talk again about what we can do today, or right now. So thank you very much, and we'll see each other…
SG: … somewhere on this planet before long.
SO: Somewhere.
SG: Thank you for the invitation. Take care, Sarah.
SO: Yes, and many thanks to our listeners, especially those joining for this first episode in German. Thank you for listening to the Content Strategy Experts Podcast, brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
The post Strategien für KI in der technischen Dokumentation (podcast, Deutsche version) appeared first on Scriptorium.
In episode 168 of The Content Strategy Experts podcast, Sarah O’Keefe and special guest Leslie Farinella, Chief Strategy Officer at Xyleme, discuss the challenges facing content operations for learning content, insights for navigating information silos, and recommendations for successful enterprise-wide collaboration.
Why do we still have these silos of content? Back to what you said, Sarah, if we’re thinking about the learner experience, the learner doesn’t distinguish between classroom, e-learning, looking something up, or going to technical documentation. They just know, “I gotta get my job done. I need to perform. I need to know what I’m doing.”
— Leslie Farinella
Transcript:
Sarah O’Keefe: Welcome to the Content Strategy Experts Podcast, brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we talk about the challenges that organizations face with content operations for learning. Hey, everyone. I’m Sarah O’Keefe, and today I’m delighted to welcome Leslie Farinella of Xyleme to the podcast. Xyleme, as you may know, has recently been acquired by MadCap Software, which also owns Flare and IXIASOFT. So Leslie, welcome. Tell us about yourself and your role at Xyleme/MadCap.
Leslie Farinella: Hi, Sarah. I’m super excited to be here today. I’ve been at Xyleme for the last eight years. Prior to that, I was in the learning content space, but on the business side, helping organizations to drive performance within their workforce. And I realized that, you know what, if we wanted to scale, we were going to have to bring in technology to help solve this problem. So I got really excited, and I jumped over to the product side. And since I’ve been at Xyleme, I’ve pretty much covered almost all of the roles, my last role being chief strategy officer.
SO: And so here we are. And I think you’re probably the perfect person to talk to about this topic where we’re getting a lot of interest all of a sudden. Well, from my point of view, maybe not from your point of view, but from my point of view, we’re getting a lot of interest in content operations for learning content.
LF: Yeah.
SO: So people are asking questions like, if I have overlapping content between my tech comm content and my learning content, why, you know, why can’t I combine those in some efficient way as opposed to what I’m doing now, which is this terrible copy and paste or worse rewrite without, you know, people ever talking to each other. But also we’re hearing from learning organizations that don’t actually have what I would consider to be tech comm content who need a more mature content workflow. So they’re asking questions like, “How can I develop learning better, faster, cheaper?” So what does that look like on your side of the fence?
LF: We absolutely hear the exact same thing, and I think it’s only gonna get worse. If we think about the root cause, about what’s really driving this conversation and making it escalate, it’s the speed of change and the need to drive agility within organizations. Organizations have to adapt faster than they have before, which means people have to learn new skills, new mindsets, and new behaviors faster than before, which means inevitably they have to learn on the go, which means that performance support and tech comm is part of that learning. And as you and I know, because you and I have had past conversations, breaking down that silo between tech comm and learning is gonna be essential to driving that agility that organizations need to change. And that’s why they’re feeling the pressure.
SO: And so what does that look like? You know, Xyleme in particular is an enterprise learning content management system, which perhaps I should have said in the intro. What does it look like when people start considering something, you know, a solution like that? What’s the executive-level argument for that?
LF: Speed, agility, cohesiveness, learner experience. And I think what we all have to remember is, when you’re buying something like a CCMS or an LCMS, you know, a component content management system or a learning content management system, they’re kind of flip sides of the same coin, but they also need to work together. And I think that’s the change in mindset we need in the industry: if you think about learning, you have formal learning. I take a course. Usually, I’m a novice. I need some scaffolding. But the majority of the learning, once I kind of get my initial scaffolding, happens by experience. It happens by solving problems. And inevitably that means looking stuff up. So it means going back to the documentation, because no one’s going to go to the LMS to flip to halfway through the e-learning course to look something up. That’s just very painful. So what I hear from the top executive level is, how do we make that whole system work together? How do we consider it from a job performance perspective, moving people from novice to proficiency across that entire spectrum, which is learning and tech comm? And that’s where this idea that we have these separate systems and these separate processes really starts to get in our way. And I think that’s where the opportunity is: to see how do we break down that silo, and how do we think about how these technologies can work better together, or maybe even collapse into a single tech stack.
SO: Yeah, and I think that, you know, a big part of this is if you go back 20, 25, 30 years, we had classroom training, basically, and we had paper, like books or maybe a cheat sheet or a job aid, but, you know, some sort of a printout. And so there was the distinction between I’m going to go to a class and learn the things, and they’re going to give me like a student guide or a textbook, you know, some sort of supporting material. And then there’s my reference library of books. And today, we still have that. I mean, we still have that distinction between class, e-learning, blended learning, and online, and all the rest of it. But there’s that bucket. And then there’s that bucket of, OK, there’s this other book-adjacent or book-derived stuff. However, today, it’s all sitting on the same website. And so now as an end user, as a software user or learner, I show up on your website, your product website, and it’s like, hey, I’m blocked on this task that I need to do. I’ve got a job I need to get done. I don’t know how to do it. And I just frankly don’t care. I just want you to give me the answer. I don’t care where it lives. Not my problem. But give me the answer and give it to me better, faster, cheaper. And then, you know, infamously, we always say, “Don’t ship your org chart,” except we always do. So what does it look like to start to foster these connections and improve the integration or the interaction or the, I’m struggling for words, which is probably a symptom of this problem. What does it look like to start fostering those connections to improve the end-user experience?
LF: I think what you just said: end-user experience. You know, we have to map that user experience. And I think that’s one thing that the learning side has done well: they’ve invested in the LMS, the learning experience platforms. Everybody still complains about them, but at least they were, you know, investing and trying, and those experiences are getting better and better because there’s more competition in the market. People are coming up with other tools. They’re bringing, you know, more algorithms into play, and then, you know, AI will play into that as well. But what they haven’t done well is content management and structured authoring. So Xyleme is an LCMS. So actually, you know, there obviously are people in the learning space that have bought into, we need to bring structured authoring into learning. But it’s not the majority. A lot of organizations still haven’t done that. And I think that once you start to bring in what tech docs already knew, you know, you’ve got to standardize to personalize. You’ve got to think modular. You’ve got to be able to standardize against your terminology. And then you can start to scale. That’s something that the learning side, you know, needs to learn, and that’s something that the LCMS, which is the counterpart to component content management, brings in, and we’ve tailored it to the audience of instructional designers and learners to help with that transition. But the base ideas underneath the technology are the same. One of the interesting things in the acquisition with MadCap and the IXIA team, when we started comparing products, we’re like, we do that, we do that too, yeah, we’ve always wanted to do that, you guys already have it. We realized very quickly we were solving the same problem and getting to the same result.
We may have made different design decisions along the way, but we were solving the same problem, and the fundamental premise underneath both technologies was the same, which then starts to beg the question, why aren’t they combined? Why do we still have these silos of content? Thinking back to what you said, Sarah, about the learner experience: the learner doesn’t distinguish between classroom, e-learning, looking something up, going to technical documentation. They just know, I gotta get my job done. I need to perform. I need to know what I’m doing. And I wanna, you know, ready myself for my next role and promotion within the organization. And they have expectations on their performance. And so how do we look at that and understand it needs to be more cohesive, and how can we, as both the tech docs and the learning industry, break down that silo with the content but also the experience itself to make that more cohesive?
SO: Yeah, I think one thing that’s sometimes overlooked in this is that the default emotional state of a person who is looking for information is something like frustration and anger, right? Because they’re not reading for fun. They’re not going to class for fun. I mean, probably. They are doing it because this class or this learning piece or this piece of information that I don’t have is standing between me and getting the job done. I need to generate a pivot table and I don’t know how, so show me how to do it. I need to do a thing, and until I do the thing, I can’t progress in my tasks of the day, and so I’m annoyed. And we’ve set aside knowledge base for the purpose of this conversation, but knowledge base usually is even worse, because usually that’s something like, my system crashed, why? So they’re not just annoyed, they’re incandescently angry because something is not working. Okay, so, and I think you said something really interesting in there about how the learning experience, you know, the downstream user experience for learning, there’s been a lot of work put into that, and comparatively less on the tech comm side. I’m not saying all tech comm is bad or anything like that, but when you look at some of the work that’s been done in producing really sophisticated e-learning and really interesting learning experiences on a platform of some sort, and then conversely on the back end, tech comm has done a huge amount of work around reuse and efficiency and automated formatting and automated delivery and multi-channel and all these things. I think there are some advantages there, and there are some things there that I think the learning world can probably leverage, and, you know, vice versa. So, you know, while you and I are ruling the world and fixing all of this, we can’t fix the integration next week, and I mean, I’ve been complaining about that for a while, but you know, that is a legitimately hard problem.
But what are some of the steps that we can take as content creators, whether learning or tech comm, to start thinking about this sort of more unified approach to enabling content? What can we do there? And what are some of those first steps?
LF: I think the first step is we have to collaborate. I think the first step is, do you even know the people on your tech comm team or your learning team? Like, do you even know who they are? So, you know, I think there’s a conversation. I think the second one is to have a shared goal of, we want to create a better user experience. Like, you know, an agreement that that is a goal that’s worth, you know, pursuing. And I think your managers, your VPs, definitely your leadership would agree that it is. And then I think mapping that out. Like, what would that learning experience look like? What’s your utopia? And then break it down, right? You can’t boil the ocean. You have to kind of have a plan. You know, what does the vision look like? And then what’s the first step? Start small. Like, what’s the first step in the vision? And I think you and I talked earlier, a procedure is a procedure. Like, there’s no magic. There’s some obvious low-hanging fruit here as far as, you know, where you can share content that drives efficiency and makes sense. And then to the learner, you’re not coming up with different terminology. We all know the brain loves consistency because it helps with retrieval within the brain. So when I see the same picture, when I see the same example, when I see the same terms, it unlocks memory within the brain. It helps with retrieval. So, you know, we can make it easier for people. But then also looking at, you know, can we take some of that technical documentation and embed it in the learning content in the LXP so it’s easy to find. Where are people going? Maybe it is to the tech doc portal. Maybe we put it in both places and we figure out single source, right? We can update it in both places. We can keep that in sync. But really understanding and mapping that learner, that end-user experience for performance, and working together, and understanding that we both have something to contribute to the conversation.
I think, you know, to your point, tech comms can learn a little bit about experience and how people, you know, retrieve information, but the learning team can definitely learn a lot about structured authoring and content management from the tech comm team. So bring those areas of expertise together, which is 80% business and 20% technology. I mean, the first part is, you know, you just have to agree and set your goals and then figure out what’s the best technical solution that will drive those goals. And I would even argue, take it small. Like, you know, do experiments, see what works, what doesn’t work, trial and error. Cause I wish I had the whole answer. I don’t. I think it definitely is a problem that we need to invest in solving. And the only way we’re going to solve it is through experimentation. But I also don’t think there’s a one-size-fits-all answer either. I think each organization has legacy tech stacks. We all know we can’t just throw out the tech stack we have, you know. We have different competing business priorities. We have different skills and capacity within our teams. So do what you can. And I think that sometimes people throw up their hands and they do nothing because they think it’s too big. But you’ve got to start small and you’ve got to start somewhere. And step one is go have lunch with the people, maybe a virtual lunch these days, but on the other side, like, talk to them, share with each other. At the end of the day, you have a common goal of driving performance within your organization. You have a shared mission. That’s where I would start.
SO: Yeah, I like figure out who your counterpart is. That seems like a reasonable achievable goal. And then, yeah, and then back, you know, work from there. What can you, you know, can you reach consensus on shared terminology? Because, you know, I mean, never mind unified content authoring, that would be lovely but can we agree to call a car seat a car seat and not sometimes a safety seat and sometimes a baby seat and sometimes a something else? Because that would be like a really good start.
LF: Yeah. And the more things you can agree on, the more things you’ll find to agree on. So start with that. How do I share, you know, the procedures? How do I keep stuff in sync? You know, how do I even reduce the time between, you know, product release, the technical documentation, and any formal training that needs to happen? How do I make sure, you know, how can I generate FAQs? There’s a lot of things that you could brainstorm that you could do together, which then fosters that collaboration.
SO: Mm-hmm.
LF: And then figure out what are the technical barriers I’m hitting. And I’ll say this as a vendor and then talk to the vendor and say, hey, here’s the business problem we need to solve. We think it’s a market problem. We think there’s value with you. We need you to fix this. Like we need you to be able to integrate these systems. And again, from the vendor side, if you make a good business case and you can show that the market in general, it’s good for the market, you can probably push their roadmap. But if you don’t speak up, if you haven’t tried, how do you know what those barriers are? So how do you know what to push? So just because it doesn’t do it today doesn’t mean you can’t get a solution.
SO: And the bigger you are, the more we would like you to kindly contact the vendors because…
LF: Yeah, the more money you have, the more clout you have. But I’ll be honest with you. As far as our roadmap on the vendor side, many times anyone who’s willing to experiment and to put some skin in the game as far as a real use case and to work together, I would rather build features and integrations based on real-world examples and real-world data than a theoretical PowerPoint we may put together from a nice product feature. And I know most product vendors are the same.
SO: …you’ll have leverage.
LF: So partnering with your tech vendors and coming to them with, this is the business problem we want to solve, this is why we think it’s worth solving, and partnering with them to solve it is going to help to break down some of those technical silos. And the good news is on the MadCap side, because we do have the IXIA, we have the Flare, we have the Xyleme, that’s our vision: how do we bring it together? It’s not gonna happen overnight because, like I said earlier, we all kind of made different design decisions which aren’t necessarily all compatible right this second, but we’re figuring out how do we make them more compatible. Like, who’s got to kind of give up what, and how can we make these work together? And because they’re all in our product stack, we have a vested interest in doing that. And honestly, we’re looking for customers, if there are any MadCap customers out there listening, we’re looking for customers who want to partner with us on that journey and help us to figure out the answer, because we know the problem pretty clearly. We know some of the answer, but the only way you truly find the answer is by partnering with customers to figure it out.
SO: So I have to ask you about AI because we’re not allowed to do podcasts without asking about AI anymore. Tell me a little bit about your take on AI in the content universe that you live in.
LF: Yeah, you know, there’s so much buzz about AI generation and the large language models and ChatGPT, and I think it’s because it kind of wowed us all and it made the news. Not that there aren’t some efficiencies to be found there around summarization and descriptions, ’cause one thing we know is that the quality of the descriptions that go in the LMS and the LXP really drives retrieval, or people being able to find something. And humans actually write really bad descriptions. AI does a better job of writing descriptions that search can find. So I think there’s something there. But what really excites me is AI retrieval. Being able to match content to a person, like to me specifically, based on my role, my context. Where am I searching from? Am I searching from within Salesforce? Am I searching within my technical app? You know, what gives some idea of what my problem might be? Maybe it even sends in error messages. What’s my region? What are my current skills? What are my skill gaps? That would get me the information that I need faster, and just the information that I need, not, you know, the 20-page document where now I’ve got to go find page five of 20. The great thing about AI retrieval is it can just bring me topic seven out of 70 and bring that back to me. So I think that really solving that retrieval problem is huge, that time to an answer. The second one is AI data. AI’s been doing a lot with data. It’s not new news as far as data classification, looking at patterns. But if we think about, if our common mission is performance and people being able to do their job, understanding holistically somebody’s journey from novice to proficiency and expert, and what really drove those. And we might find out it’s all on the managers.
And I’d argue a lot of it is their manager and their coaching. Nothing against the audiences we’re talking to, but it may have had very little to do with the learning team and the comp team, and a lot to do with the managers. Understanding that, and how we contribute to that journey, will help us understand what’s really important. And the nice thing about AI is it can bring in a lot more data and look at patterns that are much more sophisticated than we as humans can manage; we can’t hold that many variables in our heads at one time. So I’m excited about AI bringing personalization of content, matching people to content, and helping us better understand the value of the content we write and what drives that value, so that we can drive those best practices. Because I think we guess a lot, and we have our ideas, but we might be surprised at the answer. And then, yeah, AI generation definitely has a role; I don’t want to say it doesn’t have any. But honestly, it doesn’t excite me quite as much as the other two.
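The context-aware retrieval Leslie describes, narrowing content by a user's role, region, and entry point before ranking, might be sketched roughly like this. The catalog fields, records, and scoring are invented for illustration, not any real LXP schema:

```python
# Illustrative sketch of context-aware retrieval: filter assets by the
# user's context first, then rank by relevance to the query.
# All fields and records here are hypothetical.

from dataclasses import dataclass

@dataclass
class Asset:
    title: str
    roles: set        # who this content is intended for
    regions: set      # where it applies
    description: str  # well-written descriptions drive findability

CATALOG = [
    Asset("Admin password reset", {"admin"}, {"emea", "amer"},
          "Step-by-step reset of another user's password"),
    Asset("Self-service password reset", {"end-user"}, {"emea"},
          "Reset your own forgotten password"),
]

def match(query, role, region):
    """Narrow by user context first, then rank by description overlap."""
    words = set(query.lower().split())
    eligible = [a for a in CATALOG if role in a.roles and region in a.regions]
    return max(eligible,
               key=lambda a: len(words & set(a.description.lower().split())),
               default=None)

hit = match("reset my forgotten password", "end-user", "emea")
print(hit.title if hit else "nothing matched")
```

The point of the sketch is the ordering: context filtering happens before relevance ranking, so the end user sees only the topic that fits their role and region rather than the whole 70-topic document.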
SO: Well, you know, I sort of lost interest early on when I asked ChatGPT to generate a bio for me and it informed me that I had a PhD. Which, I mean, cool, but no. And there were a couple of other things like that. You said this earlier: it is important in our context to get the information right, and the thing that ChatGPT and the other generators don’t necessarily do is accuracy. They generate plausible content. But if we care about getting it right, cut the blue wire, then the red wire. No, wait, wrong! So it’s important for this stuff to be correct, and that’s the thing that GenAI really struggles with, because it doesn’t really have a concept of correct.
LF: Yep. And I think that’s where we are sitting on a gold mine with our content, because if you think of RAG, which is retrieval-augmented generation, the current leading answer for proprietary information that must be correct, it really is about retrieval. What it does is point to your vetted database of content. Well, where is that? Our LCMSs and CCMSs. They’re gold mines of content. Now, AI can handle unstructured content, so it’s not that you can’t give it a PDF or a PowerPoint or whatever. But if you give it structured content, it’s like rocket fuel. It’s just easier; it’s better. And if you’ve tagged that content, even if you used AI to help tag it, that retrieval accuracy goes up exponentially. So we are sitting on rocket fuel: if you’ve already invested in an LCMS or a CCMS and you’re doing structured authoring, you have rocket fuel to drive your AI solution. And it’s not that our databases have AI inherently in them; it’s that we have the content that’s going to feed those AI agents and help generate the right answers. One of the key things when you think about proprietary content in these RAG systems is the attribution. It will provide a response, and it may summarize, but it’s not making content up from scratch like ChatGPT would. It is retrieving it, and it gives an attribution: it tells me where it got that content from, so as the person looking at it, I can decide whether I trust that source and I can verify it. So if it’s red wire versus blue wire, and the wrong wire means something’s gonna blow up, I can go check the source and say, okay, yes, I trust that source, and I’m gonna cut the blue wire.
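The retrieval-with-attribution pattern Leslie describes can be sketched in miniature like this. The topic store, tags, and keyword scoring below are invented for illustration, not any specific CCMS or vendor API; a real RAG system would use embeddings and an LLM, but the shape is the same: retrieve from vetted sources, refuse when nothing matches, and always attribute.

```python
# Minimal sketch of retrieval with source attribution, in the spirit of RAG.
# Structured, tagged topics stand in for the "vetted database of content."

from dataclasses import dataclass, field

@dataclass
class Topic:
    doc_id: str                      # where the content lives, for attribution
    text: str
    tags: set = field(default_factory=set)

STORE = [
    Topic("wiring-guide/topic-7", "Cut the blue wire before the red wire.",
          {"troubleshooting", "wiring"}),
    Topic("install-guide/topic-2", "Mount the unit before connecting power.",
          {"installation"}),
]

def retrieve(query: str, required_tags: set = frozenset()):
    """Filter by tags first (this is what boosts accuracy), then rank by overlap."""
    words = set(query.lower().split())
    candidates = [t for t in STORE if required_tags <= t.tags]
    scored = sorted(candidates,
                    key=lambda t: len(words & set(t.text.lower().split())),
                    reverse=True)
    return scored[0] if scored else None

def answer(query: str, tags: set = frozenset()):
    topic = retrieve(query, tags)
    if topic is None:
        return "No vetted source found."   # refuse rather than make it up
    # Attribution: the reader can check doc_id and decide whether to trust it.
    return f"{topic.text} [source: {topic.doc_id}]"

print(answer("which wire do I cut", {"troubleshooting"}))
```

Because the answer carries its `doc_id`, the person cutting the wire can verify the source before acting, which is exactly the trust property Leslie highlights.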
SO: And on that cheery and hopefully non-explosive note, that sounds like a good place to wrap it up. Leslie, thank you so much for coming on. I hope we’ll continue this conversation and drive some positive change and some new cool integration and cooperation possibilities. And with that, thank you for listening to the Content Strategy Experts Podcast, brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
The post Overcoming operational challenges for learning content, feat. Leslie Farinella (podcast) appeared first on Scriptorium.
In episode 167 of The Content Strategy Experts Podcast, Sarah O’Keefe, Alan Pringle, and Bill Swallow discuss the difficulties organizations encounter when they try to create a unified content experience for their end users.
AP: Technical content, your tech content or product content, wants to convey knowledge so the user or reader can do whatever thing that they need to do. Learning content is about improving performance. And with your knowledge base content, it’s when, “I need to solve this very specific problem.” So those are the distinctions that I see among those three types.
SO: Okay, and from a customer point of view, what does this mean?
AP: Well, in reality, I don’t think the customers care. They want the information available, and they want it in the formats they want it in. And also, they want the right information so they can either get that thing done, improve their performance, or solve a specific problem.
Related links:
LinkedIn:
Transcript:
Sarah O’Keefe: Welcome to the Content Strategy Experts Podcast brought to you by Scriptorium. Since 1997, Scriptorium has helped companies manage, structure, organize, and distribute content in an efficient way. In this episode, we talk about the challenges of content operations across the enterprise. Hi, everyone. I’m Sarah O’Keefe. I’m here today with two partners in crime, Alan Pringle and Bill Swallow.
Alan Pringle: Hello.
Bill Swallow: Howdy.
SO: That first one was Alan, and the second one was Bill. Good luck with that everybody. So I have a big topic today. I want to focus on the intersection of technical content, learning content, and knowledge base content. And Alan, what’s the difference between the three?
AP: Okay, let me see if I can break this down, because I’m sure people have very strong opinions about this, and we may hear about them, but this is how I’m gonna break them down. Technical content, your tech content or product content, wants to convey knowledge so the user or reader can do whatever thing that they need to do. Learning content is about improving performance. And with your knowledge base content, it’s when, “I need to solve this very specific problem.” So those are the distinctions that I see among those three types.
SO: Okay, and from a customer point of view, what does this mean?
AP: Well, in reality, I don’t think the customers care. They want the information available, and they want it in the formats they want it in. And also, they want the right information so they can either get that thing done, improve their performance, or solve a specific problem.
At the end of the day, they don’t care what department or what group wrote it. They just want it, and they want it then and there.
SO: So this enabling content is like, here’s how you can get your job done. Here’s how you can do the thing you need to do and move on with your day so that you can generate the report or write the thing or do the code or whatever it is. They need this content so that they can do the thing. So then, we have all these silos, right? We have technical content in its silo, and we have learning content, and we have knowledge base content, and then we have tools optimized for each of those use cases or for each of those sets of authors. So, now, is this a bad thing from a content perspective?
AP: That is possibly the worst leading question I’ve ever heard on this podcast. The worst.
SO: Okay, I’ll rephrase.
AP: You don’t need to, but of course it’s bad. It is very, very bad. And the reason it’s bad is that there is so much overlap in this content. Take technical content and learning content: roughly half of it overlaps, because you’re both dealing with tasks.
SO: Procedures, yeah.
AP: Yeah, step-by-step instructions. So you don’t need two sets, one for each group. Why are we doing this? And when I say we, I mean the entire content world, because folks, we are. You’ve also got overlap between your technical/product content and your support content: troubleshooting instructions, Q&As on avoiding very specific problems. Same exact stuff, and yet again, we’re often maintaining two different versions of that information. So there you go.
SO: So what we want is shared content, right? But we can’t do it because the tools aren’t there. Is that right? It is right. I know it’s right.
AP: Well, yeah, I mean, but it’s not just the tools. It’s the people that write this content because they often have, shall we say, fairly strong opinions that they need a special flavor or they need a special twist on the content. So it’s tools, but they’re also these opinions that the content creators have that inform these problems as well, I think.
SO: Okay, and then Bill turning to infrastructure, what does this look like from an infrastructure point of view as opposed to a, I mean, shared content is kind of an infrastructure problem, but I think there’s additional ones. What does that look like?
BS: Goodie, it’s my turn. Yeah, so shared infrastructure is a big one, you know, getting everyone to kind of play in that, you know, that same sandbox. But there are other things that really need to be shared across the enterprise.
AP: Hmm.
BS: So things like taxonomy: making sure everyone is aligning on the same terms, the same way of categorizing things, the same way of organizing information. Their localization workflow, and even the vendors they’re using, so that everything goes through the same process and they get a uniform result back. And then design systems: making sure that there’s a federated search in place, and making sure that anything being produced for customer or reader consumption has the same unified experience. It might be a little bit different from content type to content type, from delivery platform to platform, but in the end you have a unified experience, so that people aren’t relearning how to engage with your content in every context you produce it.
SO: So from an infrastructure point of view, what does it look like today to set up shared infrastructure? Can you tell us a little bit about the software tools that are available that allow you to do all of this in a unified way?
BS: You know, it’s too big of a list. And that list is basically consumed with things like duct tape, string, Bondo, you name it. There is nothing out there that will give you a unified experience across the enterprise for every content type out there. Right now, it does not exist.
SO: So on the authoring side, I think there’s some unified delivery kinds of integrations. But I think we’re talking about the back end.
BS: We’re starting to see a lot with portals that collect a lot of information and present it all in one unified space, or at least provide one universal point of access for that content. And we are seeing some tools start to reach out and embrace other traditional content silos: for example, being able to develop all of your content in one single place and push it to, let’s say, a same-branded knowledge base and documentation portal. But I don’t think there’s anything out there that really grabs everything and says, okay, we’re going to do manuals, we’re going to do other tech content, we’re going to do web-based references, we’re going to do knowledge base articles and tech support guides and training materials, you name it, and produce it all from one source to all these different outputs. So we have a lot of duct tape and string in place at the moment.
SO: And point solutions like, hey, we’re optimized for learning. Hey, we’re optimized for KB. We’re optimized for tech com. And I mean, it does seem to me that there’s a really big disconnect between what our clients are asking for and what the market has available because our clients are asking for slash demanding unified authoring solutions. And like you said, we have duct tape and string to offer them.
BS: Mm-hmm.
SO: So, okay, let’s step back a little bit and say you don’t do this. You take the departmental approach: you push your tech com content through your tech com solution to the web, you push your KB to a KB article database thing, and you have learning content which goes to a learning management system and some sort of learning platform. What happens when those are not unified? Alan, I’ll start with you. What happens if they’re not unified from a content point of view?
AP: Well, the terminology you’re using is not gonna be consistent, or often is not consistent, across your content types. For example, you go to your knowledge base, and you find a support article that uses a certain term for some widget. And then later on, when you try to search for the name of that widget in some other content, like on the product side, and that product side uses a slightly different term, you’re not gonna get a search result, because they’re using different terminology for what is really the same exact thing. So you have that lack of alignment. And the same thing is true, for example, with your product content and your training content. You may have slightly different how-tos or tasks to accomplish the same exact thing. So you’ve got those contradictions there in how to do things, in terminology, and you’re not getting a consistent voice at all in what you are presenting to your customers, because of these departmental silos that we were talking about.
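The terminology mismatch Alan describes is the kind of thing a lightweight check can catch before content ships. A minimal sketch, with an invented term map and sample article (not any real terminology management tool):

```python
# Hypothetical terminology check layered over siloed content:
# flag banned variants so every silo converges on one preferred term,
# which keeps cross-silo search from silently missing results.

PREFERRED = {
    # preferred term -> variants that should be replaced
    "control knob": {"dial", "adjustment wheel"},
    "sign in": {"log on", "logon"},
}

def find_term_issues(text: str):
    """Return (found_variant, preferred_term) pairs for a piece of content."""
    issues = []
    lowered = text.lower()
    for preferred, variants in PREFERRED.items():
        for variant in variants:
            if variant in lowered:
                issues.append((variant, preferred))
    return issues

kb_article = "Turn the dial to reset, then log on again."
for variant, preferred in find_term_issues(kb_article):
    print(f'Replace "{variant}" with "{preferred}"')
```

As Sarah notes later in the conversation, a check like this doesn't require unified tooling: it can run over exports from each silo's system independently, as long as everyone shares the same term map.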
SO: And then Bill, on the infrastructure side, what do you see there in terms of problems that surface?
BS: A lot of it comes back to user experience, because all these tools have, I guess, a targeted focus. They have a lot of custom feature sets built just for that type of content, and a lot of the more generalized features are built out in slightly different ways. And you may have a lot of ability to customize, but generally they’re not customized, for whatever reason: either it’s too difficult, there’s no time, or one group likes it one way and another group likes it another way. So you have these disjointed user experiences just going from one area of the website to another: navigating manuals online, then going over to a knowledge base and seeing a completely different interface and not knowing how to navigate it out of the box. So you’re now asking your customers to learn how to use your content, in addition to having to use your content to find information in the first place.
SO: So we’re, I mean, we’re doing a lot of complaining, right?
BS: It’s fun to complain.
SO: It is fun to complain. But I guess as consultants, our job is, in fact, to take on the complaints and then come up with a solution. So in the absence of the magic system that does all the things, one thing we’ve seen a lot of customers do is make a compromise: okay, we’re gonna take the thing that’s optimized for A, but we’re gonna use it for A and B, even though it’s suboptimal for B. And of course, then the B people feel like B-class citizens, which isn’t great, but enterprise-wide it’s very, very helpful. On the taxonomy side of things and some of these others, it does feel as though you can build that over the top, integrate it into all the other tools, and push it down onto them. So I guess that part’s okay-ish. But what does this look like? My question to both of you is: what’s the solution here? What’s the path forward, and where do we want this to land? For our personal gratification, but mostly for our customers. What do our customers need the solution to look like? The infamous line is that you don’t want to ship your org chart, right? You don’t want your website to be a reflection of your org chart at a level that’s recognizable to the end customer, because, again, they don’t care. So what are some of the solutions here? What are some of the options that people have?
AP: Well, I think one thing you’ve got to do is step back and realize this is not just a tech problem. Now, the tech problem is very real in regard to the silos, because you’re using different sets of tools, especially on the authoring and content creation side, to get things done. But I think all of those content creators need to step back and think a little more globally across the company, and not just, this is for my people, this is just for me. They need to take a bigger step back and think: how can other departments potentially use this information? And then you start getting into tech and how they can actually reuse it. That’s where you shift from culture to tech, and how tech can enable that sharing and reuse.
BS: Mm-hmm.
SO: And terminology is a good example. If you standardize terminology, you can ask people to follow that across all their systems. “Use this term and not that term” does not require a unified content management solution; it’s just a writing practice. And you could layer terminology management over the top of multiple systems. It’s more expensive, but you could. Bill, do you have any hope?
BS: Mm-hmm. There’s always hope. We’re starting to get there, especially as systems are starting to be able to somewhat talk to each other via API. So there is a way to share information across them. It’s not what I would call anything remotely close to intelligent reuse, because you’re still duplicating content from one system to another. But if you’re at least consistent about writing in one place and pushing content out where it needs to go via those hooks, that’s better than authoring everything separately.
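The "write once, push via API hooks" pattern Bill mentions might look roughly like this in outline. The payload shapes and system names below are hypothetical, since every LMS or KB has its own API; the point is that every downstream copy is derived from the one vetted source:

```python
# Hedged sketch of single-source fan-out: one authored topic transformed
# into the payload each downstream system expects. Content is duplicated
# downstream, but only ever edited at the source.

import json

SOURCE_OF_TRUTH = {
    "topic_id": "reset-password",
    "title": "Resetting your password",
    "body": "1. Open Settings. 2. Choose Security. 3. Select Reset password.",
}

def to_kb_payload(topic):
    # Knowledge-base article shape (invented for illustration)
    return {"articleTitle": topic["title"], "articleBody": topic["body"]}

def to_lms_payload(topic):
    # Learning-module shape (also invented)
    return {"moduleName": topic["title"], "content": topic["body"]}

def publish(topic, transforms):
    """Fan one source topic out to every downstream system's shape."""
    return {name: fn(topic) for name, fn in transforms.items()}

payloads = publish(SOURCE_OF_TRUTH,
                   {"kb": to_kb_payload, "lms": to_lms_payload})
print(json.dumps(payloads, indent=2))
```

In a real pipeline, each transform would end in an HTTP call to that system's API; keeping the transforms mechanical is what makes the push repeatable instead of a copy-and-paste job that someone eventually forgets.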
AP: You still have a single source of truth in what you’re talking about and that’s the end goal or it should be the end goal for this problem.
BS: Exactly.
SO: And it might be helpful to look at single source of truth in terms of something as concrete as the process of doing a task, like how do I change my password in a database. There’s a four-step or a two-step or a one-step procedure, but there’s one procedure and only one way of doing it. And I think a lot of times the ultimate solution is to do some forensics, essentially, on where that information originates. If it originates here, and I am a downstream user of that information, that’s fine; just don’t ever modify it downstream. Always go back to the source of the information, modify it at the beginning, and then flow it back through. The problem that arises is that in a scenario where “flow it back through” involves manual processes or copying and pasting, it’s always going to fail, because people fail, right? People don’t do the thing. And so you get those inconsistencies, and now there’s four different ways of changing your password: one in the tech docs, one in the learning content, and, like, two in the knowledge base. And now what do you do with it?
AP: And you’ve got frustration among your users because they’re getting inconsistent information. And then you’ve got frustration with your content creators because they constantly feel like they’re having to go hunt for something or it is not worth my time to go find it.
BS: Mm-hmm.
AP: I’m just gonna copy and paste, and then they forget to update one of the umpteen versions they have. And then they’re stuck in this constant go-go-go process. So it’s bad on both sides of the content equation: for your content creators and for the people who are consuming that content as well.
SO: Okay, well, this is super encouraging. And with those helpful words from Bill and Alan and maybe me, but mostly them, I will leave you to it. And so with that, thank you for listening to the Content Strategy Experts podcast brought to you by Scriptorium. For more information, visit scriptorium.com or check the show notes for relevant links.
The post The challenges of content operations across the enterprise (podcast) appeared first on Scriptorium.